Random stuff: Varozhka: Introduction (part 1)

As you may know, Netflix organized a competition for systems predicting user ratings for movies.

I'm sure a lot of bright people have ideas on how to improve those predictions, but do not have the time to spend on it.

This project is a framework that automates most of the dirty work with the dataset, so you can concentrate on the prediction algorithm ;)

Current features:

No additional DB engine is required. All indexes are loaded in memory.
Abstraction layer for working with the data (this is where you plug in your own code).
Data access layer.
Easy way to check RMSE against the probe set.
Generation of submission dataset.
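
To make the RMSE check a bit more concrete, here is a minimal sketch in Python. Varozhka itself is a .NET project, so this is not its API: the function name `rmse` and the sample numbers are made up for illustration. RMSE is just the square root of the mean squared difference between the predicted and the actual ratings.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long rating sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one rating is required")
    # Sum of squared differences, pair by pair.
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))


# Hypothetical probe-set sample: what the estimator said vs. what users rated.
predicted_ratings = [3.5, 4.0, 2.0]
actual_ratings = [4, 4, 1]
print(rmse(predicted_ratings, actual_ratings))
```

The lower the number, the closer the predictions are to the real ratings.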

So, basically, you can download the Netflix dataset, extract it to a directory, start a wizard (which does all the import tasks), implement your own rating estimator, and use a wizard to submit the results to Netflix.
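
To give an idea of what a "rating estimator" might look like, here is a rough sketch in Python. It is an assumption for illustration only: the class name `MovieMeanEstimator`, its `fit` and `estimate` methods, and the `(user_id, movie_id, rating)` tuple layout are all hypothetical and not taken from the Varozhka code, which is .NET. The estimator predicts the average rating of the movie and falls back to the global average for movies it has not seen.

```python
from collections import defaultdict


class MovieMeanEstimator:
    """Hypothetical baseline: predict each movie's mean training rating."""

    def __init__(self):
        self.movie_means = {}
        self.global_mean = 0.0

    def fit(self, ratings):
        """ratings: iterable of (user_id, movie_id, rating) tuples."""
        totals = defaultdict(float)
        counts = defaultdict(int)
        for _user_id, movie_id, rating in ratings:
            totals[movie_id] += rating
            counts[movie_id] += 1
        if not counts:
            raise ValueError("no ratings to learn from")
        self.movie_means = {m: totals[m] / counts[m] for m in counts}
        self.global_mean = sum(totals.values()) / sum(counts.values())

    def estimate(self, user_id, movie_id):
        # The user is ignored by this baseline; a smarter estimator would use it.
        return self.movie_means.get(movie_id, self.global_mean)


# Made-up training data, only to show how the pieces fit together.
training = [(1, 10, 4), (2, 10, 2), (1, 20, 5)]
estimator = MovieMeanEstimator()
estimator.fit(training)
print(estimator.estimate(user_id=3, movie_id=10))
```

A real estimator would replace the body of `estimate` with something cleverer; the point is only that the prediction logic can live behind a small interface while the framework handles the data.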

The project is named Varozhka (the Belarusian word for "fortune-teller"). It is hosted at Google Code and SourceForge.net.

This is an introductory post about the project. More details later...

NOTE: The project is under development, and most of the code is not optimized in any way.

Technorati tags: netflix, netflixprize, netflix prize, varozhka, DIY


Labels: .net, netflix, netflix prize, netflixprize, varozhka


Friday, December 15, 2006
