I have acquired several very large files. Specifically, CSVs of 100+ GB.

I want to search for text in these files faster than manually running grep.

To do this, I need to index the files, right? Would something like Aleph be good for this? It seems like the right tool…

https://github.com/alephdata/aleph

Any other tools for doing this?

  • yaroto98@lemmy.org
    2 months ago

    Done this with massive log files. Used Perl and regex. That’s basically what the language was built for.

    But with CSVs? I’d throw them in a db with an index.

    • SheeEttin@lemmy.zip
      2 months ago

      Agreed. If the data is suitable enough, there are plenty of tools to slurp a CSV into MariaDB or whatever.
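
For anyone wanting to try the "throw it in a db with an index" route without setting up a server first, here is a rough sketch using Python's built-in `sqlite3` and `csv` modules. SQLite stands in for MariaDB here purely for convenience; that swap is my assumption, not something suggested above. The function names (`load_csv`, `add_index`, `lookup`) and the batch size are made up for illustration. It also assumes the first row is a header and that every row has the same number of fields.

```python
import csv
import itertools
import sqlite3


def quote_ident(name):
    """Quote a SQLite identifier (table or column name)."""
    return '"' + name.replace('"', '""') + '"'


def load_csv(conn, csv_path, table, batch_size=50_000):
    """Stream a CSV into a table of TEXT columns, one batch at a time.

    Assumes the first row is a header and all rows have the same width.
    Only one batch is held in memory, so file size is not the limit.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(f"{quote_ident(c)} TEXT" for c in header)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {quote_ident(table)} ({cols})")
        placeholders = ", ".join("?" for _ in header)
        insert = f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})"
        while True:
            batch = list(itertools.islice(reader, batch_size))
            if not batch:
                break
            conn.executemany(insert, batch)
        conn.commit()
    return header


def add_index(conn, table, column):
    """Create a B-tree index on one column (built after the bulk load)."""
    name = quote_ident(f"idx_{table}_{column}")
    conn.execute(
        f"CREATE INDEX IF NOT EXISTS {name} "
        f"ON {quote_ident(table)} ({quote_ident(column)})"
    )
    conn.commit()


def lookup(conn, table, column, value):
    """Exact-match lookup; this is the kind of query the index speeds up."""
    sql = f"SELECT * FROM {quote_ident(table)} WHERE {quote_ident(column)} = ?"
    return conn.execute(sql, (value,)).fetchall()
```

Usage would be something like `conn = sqlite3.connect("data.db")`, then `load_csv(conn, "big.csv", "records")`, `add_index(conn, "records", "email")` and `lookup(conn, "records", "email", "someone@example.com")`, where the file, table and column names are placeholders. One caveat: an ordinary index like this helps with exact or prefix matches on a column, not with arbitrary substring searches, so whether it beats grep depends on what kind of searching you need.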