I first came across Hume.ai through an Avi Schiffman tweet. I think they are working on one of the most important tasks in AI research right now: incorporating unconventional, human-centric modalities into learning.

I applied to them a while back; the application asked why I was interested, and I wrote more than a few words, so I'm adding it to my writings here.


Deep neural networks learn to approximate broad function classes, and Transformer models do so even more powerfully. The publicized successes of learning from human preference have been primarily text-based, but for the reason above, text shouldn't be the stopping point.

Learning to predict human emotions is a huge search and optimization space; I think tackling it by learning from data across as many modalities and properties as possible will lead to the most important preference-learning and alignment breakthrough in AI.
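To make the idea concrete, here is a minimal sketch of what I mean by extending preference learning beyond text: a Bradley-Terry style reward model scoring fused multi-modal features instead of text alone. This is purely my own illustration, not Hume's method; the feature names and dimensions (text, audio, device-signal embeddings) are made up for the example.

```python
# Minimal sketch (my own illustration, not Hume's method): a pairwise
# preference ("reward") model that scores fused multi-modal features
# instead of text alone. All feature names and sizes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiModalRewardModel(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, signal_dim=16, hidden=256):
        super().__init__()
        # Fuse text embeddings with non-text channels (e.g. prosody, device signals)
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + audio_dim + signal_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar preference score
        )

    def forward(self, text_emb, audio_emb, signal_emb):
        fused = torch.cat([text_emb, audio_emb, signal_emb], dim=-1)
        return self.mlp(fused).squeeze(-1)

def preference_loss(model, chosen, rejected):
    """Bradley-Terry style loss: the human-preferred sample should score higher."""
    score_chosen = model(*chosen)
    score_rejected = model(*rejected)
    return -F.logsigmoid(score_chosen - score_rejected).mean()

if __name__ == "__main__":
    model = MultiModalRewardModel()
    batch = 4
    make = lambda: (torch.randn(batch, 768), torch.randn(batch, 128), torch.randn(batch, 16))
    loss = preference_loss(model, make(), make())
    loss.backward()
    print(f"loss = {loss.item():.4f}")
```

The point of the sketch is only that the preference signal itself is modality-agnostic: once non-text channels are embedded, the same pairwise objective applies.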

The first time I came across Hume's website, it excited me because what you are doing captures my own thinking well. I have been thinking (maybe too much) about the implications of learning from non-text data. Hume's approach, as far as I can see, learning from unconventional inputs like device signals and images, aligns with my view of a generalized learning problem built on that kind of data. Besides, solving this problem well will open up the AI landscape to further amplify inter-human communication and understanding.

Follow-up

Some of my intuitions about improving learning with multi-modal data can already be tested. Instead of sitting here and waiting for an update, I might as well hack something together. When I do, I'm 90% sure I'll forget to update the backlink here and will just throw down a new project write-up.

Alternatively, there is much to write down about my entire “belief” around data and capabilities.