status update, 20240528: I will probably pause writing here for a while. originally, I kept this page because I thought each short activation note (haha 🤓) isn't really worth both my and the reader's effort to click into a new page. I still think that; it's just that I'd probably link to my twitter instead.


ai safety felt inherently obfuscating to me. of course, the field began by thinking about the unknowns (at the time), and many virtues and good outcomes came out of it. however, the most sustainable work has been the kind that 1. does not obfuscate further, and 2. clarifies things and applies engineering principles on top of those clarification efforts.

biopunk (@npmsolspar) told me that he met one of the cofounders of neuralink last month, who has since moved on to found another company (integrative neurotechnologies). the trend now is to work on neuromodulation rather than bci.

need to be spending more money on evals: although this lesson is less obvious than the “data is gold” one, I find myself agreeing with the safety-pilled people on this (considering the op was from anthropic): evals are for those who actually care about the model, the same way I would care about my kids, if I ever have any one day.

From kepano:

  • To make the text-editing experience as seamless as image manipulation, language models need to be local to the device so that they can be private, offline, and future-proof

it’s good practice to handle all enumeration values in a switch statement to avoid unexpected behavior.
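a minimal sketch of that idea in python, using `match` (python’s switch analogue) and a made-up `Status` enum; `assert_never` makes the type checker complain if a new member goes unhandled:

```python
from enum import Enum
from typing import assert_never  # python 3.11+

class Status(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    ARCHIVED = "archived"

def label(status: Status) -> str:
    match status:
        case Status.PENDING:
            return "pending review"
        case Status.ACTIVE:
            return "active"
        case Status.ARCHIVED:
            return "archived"
        case _:
            # a static type checker flags this branch if a new Status
            # member is added without a corresponding case above;
            # at runtime, reaching it raises an error
            assert_never(status)
```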

2024-03-12

the best-performing human not only interacts with the world more efficiently and has a great policy, but furthermore guides their actions not purely by rewards, but also by internal heuristics. This is partially the inspiration for the recursive summarization paper, where the authors first used SFT to make the model capable enough in the first place.
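a minimal sketch of that SFT warm-start step, assuming a toy torch model; `model`, `demo_batches`, and `optimizer` are hypothetical placeholders, not anything from the paper:

```python
import torch.nn.functional as F

def sft_warm_start(model, demo_batches, optimizer):
    """behavioral cloning on demonstrations, so the policy is capable
    enough before any reward-based fine-tuning happens."""
    for inputs, targets in demo_batches:
        logits = model(inputs)                   # (batch, vocab)
        loss = F.cross_entropy(logits, targets)  # imitate demonstrations
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```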

2024-03-05

Today I found my problem: turning my intuitions into code is practically Mt. Everest for me. It’s hard, and it’s bad.

2024-02-24

Not a lot of things are zero-sum. So my habit of “pivoting”, where I stop thinking about something as soon as I learn someone else already knows it, is invariably unsustainable.

Dream career composition

“Enthusiasm matters!” — Nat Friedman

My dream work-life is 50% basic research, 50% applied engineering work, and 50% reading papers and books. I think I have done plenty of hopping around topics and am very close to finding the niche I am passionate about.

Benchmarks in measuring oversights

The Scalable Oversight paper states:

Alignment in this context is best defined by contrast with capability: We can say that a language-model-based system is capable of solving a task if it can be made to perform well on the task through some small-to-moderate intervention, such as fine-tuning or few-shot prompting with a moderate amount of high-quality task data, with the intuition that this shows that the model already has most of the skills and knowledge needed to succeed at the task. Such a system is misaligned if it is capable under this definition but performs poorly under naïve zero-shot prompting.

My problem with benchmarking in this direction can be framed as the question: “are our benchmarks sufficient to measure model capability?” If we go down the path of applying self-play methods, which raise model performance on benchmarks as far as their training data allows, can we keep increasing the difficulty of an arbitrary benchmark so that the model scores lower?
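a hypothetical sketch of that question as a loop, where `generate_benchmark` and `evaluate` stand in for whatever eval harness you use:

```python
def find_breaking_difficulty(model, generate_benchmark, evaluate,
                             score_floor=0.5, max_level=10):
    """raise the benchmark's difficulty knob until the model's score
    drops below a floor; returns the first level where it breaks."""
    for level in range(1, max_level + 1):
        tasks = generate_benchmark(difficulty=level)
        score = evaluate(model, tasks)
        if score < score_floor:
            return level
    return None  # the benchmark never got hard enough to expose limits
```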

I think to effectively pursue this, I’ll have to understand the deal with self-play. It’s just two or three papers; how hard can they be?

Over- and Underdrive when Researching

One interestingly detrimental attribute of my learning is that I don’t know how to hit “stop” on my thoughts when I’m taking things in, but also don’t know how to hit “continue” when I’m producing. In car terms, I won’t hit the brake when absorbing information, but also won’t even think about keeping my foot on the gas when it’s my turn to write or produce something. Coding is a bit blurry because you can kinda argue it both ways: I need to take something in to output some code.

If I want to read about wide neural networks and I come across the slightest trace of optimization algorithms, I’m done for the period: an arbitrary paper link will lead me down a rabbit hole I don’t even know I’m in, and I won’t realize it until it’s too late: I did not learn enough information relevant to my goal. This has been postponing my research work on multiple fronts. I’m literally fighting a 1v7 against 7 cheating AIs in sc2 right now.

Update 2024-02-21

I do think these are tractably solvable problems if I am more mindful about them, though. I have seen some improvements over the past month.

On writing

I used to write a lot every day, but that lapsed when I resorted to thinking that everything I am thinking is intuitive and has probably been said in better ways. As the implications of scaling laws tell me, this is factually wrong. (I learned scaling laws in 2023.) What is interpersonal communication, if not modelable information exchange? Even in the age of information abundance, if I want to transition into a new type of human, it is no longer my privilege to cruise along under the pretense of passive data generation (by devices). Either I dominate my data pipeline and have a say in what models will know about me, or I don’t.