Developers, Developers, Developers! Maksim Sorokin IT Blog


Memory Leak in Haskell During File Read

UPD: This issue has been resolved.

I used Haskell to run certain operations on a large set of XML files: parsing them, running regular expressions, and so on. The set is not that big, but still: 90,000 XML files totalling 2.7 GB.
I ran into memory leaks in every program I wrote. It was quite annoying, since Haskell had been chosen as the primary language for the project and I needed to produce some results relatively fast.

Today I decided to take a closer look at the stage where the leak occurred. My initial assumption, that the leak happens during the file read, was confirmed. Initially I used readFile to read each file, but in this example I use Data.ByteString, since it gracefully handles closing the file after opening it.

Here is my example:

module SizeAnalyzer where

import qualified Data.ByteString as B
import Utils

docsDir = "/home/mah/Documents/university/MIPH/data/updatedXml"

main =
   do filenames <- getFilenamesInDirectory docsDir
      acc <- test filenames 0
      return acc

test [] acc = return acc
test (fn:fns) acc =
   do contents <- B.readFile (docsDir++"/"++fn)
      test fns $ acc + (B.length contents)

Memory usage grows slowly. Even after the program has finished, the used memory is not freed!
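
This growth is consistent with lazy accumulation: the expression acc + (B.length contents) is passed along as an unevaluated thunk, and each thunk keeps its file's contents alive until the total is finally demanded. Below is a minimal sketch of a strict variant, using $! to force the running total at every step. The names sumSizes and totalSizeInDirectory are hypothetical, and listDirectory from the directory package is only an assumed stand-in for getFilenamesInDirectory from my Utils module.

```haskell
import qualified Data.ByteString as B
import Control.Monad (filterM)
import System.Directory (doesFileExist, listDirectory)
import System.FilePath ((</>))

-- | Sum the sizes in bytes of the given files.
-- The running total is forced with ($!) before each recursive call,
-- so no thunk holds on to the contents of a file already counted.
sumSizes :: [FilePath] -> IO Int
sumSizes paths = go paths 0
  where
    go [] acc = return acc
    go (p : ps) acc = do
      contents <- B.readFile p
      go ps $! acc + B.length contents

-- | Hypothetical replacement for the Utils helper used above:
-- list the regular files in a directory and sum their sizes.
totalSizeInDirectory :: FilePath -> IO Int
totalSizeInDirectory dir = do
  entries <- listDirectory dir
  files <- filterM doesFileExist (map (dir </>) entries)
  sumSizes files
```

Calling totalSizeInDirectory docsDir >>= print from main should then keep roughly one file's contents in memory at a time, although I have not measured this on the full data set.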

Comments (2)
  1. That’s because you’ve written your code to have a memory leak. Replace $ with $! and the problem will go away.

  2. Precisely!
    As you can see, there is an “UPD” at the top, referring to the post where it was solved ;)
