Developers, Developers, Developers! Maksim Sorokin IT Blog

23 Mar 2010

New Project On Data Mining

I am happy to announce that I am starting work on my first research project for Copenhagen University. The project is about data mining on a huge collection of XML documents. Since we all like functional languages at our faculty, most of it will be implemented in them. So I hope there will soon be a lot of posts about data mining and functional languages!

Comments (5) Trackbacks (0)
  1. Python + lxml will help you out :)

  2. Hi Anton!
    If you don’t mind, I will respond in English.

    Anton suggests using Python + lxml. Well, the thing is that it makes no essential difference which language is used. The goal of the project is a bit different. But since we all like functional programming at our faculty, I am using Haskell. At least for now =)

  3. Is speed essential when we are talking about huge amounts of data, or does it not matter in this particular case?

  4. Hmm, after a little research I found that Haskell is (in most cases) faster than Python, or at least equal =)

  5. Well, first, I don’t think that Python with lxml would be much faster in this case. I am not using pure Haskell either; I use a dedicated module for parsing. Besides, speed is not that essential right now. Also, the preprocessing scripts and the first planned analysis logic are not that complicated and can easily be rewritten later on.

    But anyhow, I will try Python + lxml and compare the speed. Because I trust you, Anton =))

