Collaborative Machine Learning

Tuesday, April 28, 2009

I’m imagining a computer program running on a basic server cluster (read: Google) that’s able to learn stuff by asking random users random questions. It would be primed with some basic knowledge by the programmer, but every day it would find something interesting (e.g. a random dictionary word) and probe its properties. The types of questions would initially be naïve (is “apple” a noun, or is “to apple” an action?), but would gain depth as knowledge becomes available (how is an apple different from an orange?).

A simple way of sending the program to school would be to expose it on many websites where it could ask questions and accept answers from users. So… would this work?
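
Here is a rough sketch of what that ask-and-record loop could look like. Everything in it is an assumption made for illustration: the `Learner` class, the `min_votes` threshold, and the single hard-coded question template are hypothetical, and a real version would need to generate deeper questions as it learns.

```python
import random
from collections import Counter, defaultdict


class Learner:
    """Toy question-asking learner. All names here are hypothetical."""

    def __init__(self, words, min_votes=5, seed=None):
        self.words = list(words)            # the "primed" vocabulary
        self.min_votes = min_votes          # answers needed before trusting a result
        self.votes = defaultdict(Counter)   # question -> tally of answers
        self.rng = random.Random(seed)

    def next_question(self):
        # Pick a random word and ask the naive noun-or-action question.
        word = self.rng.choice(self.words)
        return f"Is '{word}' a noun or an action?"

    def record(self, question, answer):
        # Normalize lightly so "Noun " and "noun" count as the same answer.
        self.votes[question][answer.strip().lower()] += 1

    def belief(self, question):
        # Return (most popular answer, its share of votes),
        # or None if too few users have answered so far.
        counts = self.votes.get(question, Counter())
        total = sum(counts.values())
        if total < self.min_votes:
            return None
        answer, n = counts.most_common(1)[0]
        return answer, n / total


bot = Learner(["apple", "orange"], min_votes=3, seed=1)
question = bot.next_question()
for reply in ["noun", "Noun", "action"]:   # answers from three imaginary users
    bot.record(question, reply)
print(question, bot.belief(question))
```

The share returned by `belief` gives a crude confidence value, so the program could hold off on asking deeper questions about a word until the basic ones are settled.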

6 Comments

  1. fishzle says:

    I read about something like this a couple of years ago. I tried to dig up my findings, but I think it predates my delicious.com usage… Hmm. I know that there were at least 2 academic outfits doing this sort of thing.

    BTW, you don’t need the cloud to do it; you can run it on a regular server. I mean, the input is coming from users and search patterns, so you just need some brains to write the questions, interpret the answers, and decide what the next question should be.

    Oh and you need to get lots of visitors and truthful answers.

  2. Jiang Yio says:

    True… though with large numbers of users the untruthful answers tend to be diluted.

  3. Boris says:

    Wisdom of crowds. The way you describe it I’m not sure if a machine learning algorithm is even required. Get enough responses and I’d imagine the more popular ones will tend to be right. You can also use people’s responses (action vs. noun), assign some probability values and run a Bayesian network. For example, if you know that something is a noun, sweet, orange in color, and a fruit, what is the probability of it being an orange? What about if you only know that it’s orange in color and a fruit? Or if it is definitely an orange and also a fruit, what’s the likelihood of it being sweet? But I’m not entirely sure how you intend for this to work and I am no expert by any means so these are just some ideas.

  4. Jiang Yio says:

    Hm, I’m not sure either, but I’m just throwing ideas around hoping that someday I or somebody else might come up with something better after reading this.

    “Wisdom of crowds. The way you describe it I’m not sure if a machine learning algorithm is even required. Get enough responses and I’d imagine the more popular ones will tend to be right.”

    Yeah: like Amazon’s Mechanical Turk! It’s a nifty crowdsourcing platform.

    But I was thinking more along the lines of actually incorporating this information into a database somehow. Now that would be difficult.

    Here’re some interesting ideas: http://www.110mb.com/forum/empty-t45737.0.html

  5. Boris says:

    Dude… imagine if we could build a robot that can move around on its own and receive input from visual, tactile and auditory stimuli. We could totally, like, program some basic concepts into it (some form of evidence if you wish), and then have it collect this data and analyze its properties and be able to classify its surroundings into categories and also differentiate them (what you were talking about), but also be able to determine relationships and dependencies among objects in a logical manner (what I was sort of suggesting with the probabilistic approach). Dude… it’s like the ultimate AI. It’s like creating a newborn and making it capable of learning at least to some extent. Now that’s an awesome (and extremely complex) project. LOL, add some chemicals and chemical sensors (maybe with time, when a good working idea of how to do this comes about), and I wonder how this thing would stack up to a human?

  6. Jiang Yio says:

    This thing would PWN a human if it had access to Skynet :)
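
The probability questions Boris raises in comment 3 can be sketched with a naive Bayes classifier, which is a simplification swapped in here for the full Bayesian network he mentions: it assumes properties are independent given the object. The `NaiveBayes` class, the property names, and the handful of made-up user answers below are all hypothetical.

```python
from collections import defaultdict


class NaiveBayes:
    """Toy naive Bayes over yes/no properties. Names are hypothetical."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.label_counts = defaultdict(int)    # object -> number of answers seen
        self.feature_counts = defaultdict(lambda: defaultdict(int))

    def observe(self, label, features):
        # One user's answer: "this object has these properties".
        self.label_counts[label] += 1
        for feature in features:
            self.feature_counts[label][feature] += 1

    def likelihood(self, feature, label):
        # Smoothed P(feature | label), e.g. P(sweet | orange).
        n = self.label_counts[label]
        return (self.feature_counts[label][feature] + self.alpha) / (n + 2 * self.alpha)

    def posterior(self, features):
        # P(label | known features). Properties we know nothing about
        # are simply left out of the product.
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in list(self.label_counts.items()):
            p = n / total
            for feature in features:
                p *= self.likelihood(feature, label)
            scores[label] = p
        z = sum(scores.values())
        return {label: s / z for label, s in scores.items()}


nb = NaiveBayes()
for _ in range(3):                              # made-up crowd answers
    nb.observe("orange", ["fruit", "orange-colored", "sweet"])
    nb.observe("apple", ["fruit", "sweet"])
for _ in range(2):
    nb.observe("carrot", ["orange-colored"])

print(nb.posterior(["fruit", "orange-colored", "sweet"]))
print(nb.posterior(["fruit", "orange-colored"]))
print(nb.likelihood("sweet", "orange"))
```

With this toy data, knowing that the thing is also sweet makes “orange” more probable than knowing only that it is an orange-colored fruit, which is the kind of comparison the comment asks about.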