Drishti

202405221143
Status: #MOC
Tags: Drishti

Papers

  1. TaskVine: Managing In-Cluster Storage for High-Throughput Data Intensive Workflows

Updates

  1. LCLS (Linac Coherent Light Source)
    • Unable to run. Could not find documentation either
  2. BerkeleyGW
    • Able to run. Good documentation
    • Appears compute-bound: a few inputs followed by a lot of computation
  3. CosmoFlow
    • AI-based. No clear indication of how it constitutes a workflow
    • Could be a good candidate, with a lot of I/O
    • Python-based workflow
  4. Flash-IO
  5. RAMSES
  6. h5bench
    • Unsure how it constitutes a workflow, but promising, quick, and easy
  7. Megatron-LM
    • LLM training example
    • Promising use case for Python
    • Requires GPUs :/

Next Steps

  1. Find something that we can run as a series of scripts
    1. Make a list of workflows
    2. If any papers have artifacts, try to run them
  2. https://wangchen.io/static/pub/papers/24-REXIO-PDC-Montage.pdf
  3. https://github.com/t3n0/qeflow
  4. https://pegasus.isi.edu/workflow_gallery/
  5. https://pegasus.isi.edu/documentation/examples/
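
For the first next step (running a candidate as a series of scripts), a minimal Python sketch of a sequential driver could look like the following. This is only an illustration under assumptions: `run_workflow` is a hypothetical helper, not part of any of the tools listed above, and the inline `-c` commands stand in for real workflow stages, which are not known yet.

```python
import subprocess
import sys
import time


def run_workflow(steps, workdir="."):
    """Run each step (a list of command-line arguments) in order.

    Stops at the first failing step. Returns a list of
    (command, return_code, elapsed_seconds) tuples for the steps that ran.
    """
    results = []
    for cmd in steps:
        start = time.perf_counter()
        proc = subprocess.run(cmd, cwd=workdir)
        elapsed = time.perf_counter() - start
        results.append((cmd, proc.returncode, elapsed))
        if proc.returncode != 0:
            break  # assume later steps depend on earlier ones, so stop here
    return results


if __name__ == "__main__":
    # Placeholder stages (assumption): replace with the real scripts
    # of whichever workflow is chosen.
    placeholder_steps = [
        [sys.executable, "-c", "print('stage 1: prepare inputs')"],
        [sys.executable, "-c", "print('stage 2: compute')"],
    ]
    for cmd, code, seconds in run_workflow(placeholder_steps):
        print(f"exit={code} time={seconds:.2f}s cmd={cmd[-1]!r}")
```

Recording the per-step wall time is a cheap first signal of where a workflow spends its time; it does not by itself distinguish compute from I/O.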