Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
55 changes: 28 additions & 27 deletions ipsuite/dynamics/md_nodes.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import contextlib
import logging
import sys
import typing as t
Expand Down Expand Up @@ -125,10 +126,8 @@ class ASEMD(zntrack.Node):
Output path for model-specific output files.
laufband_path : Path
Path to the job queue database file.
frames : list[ase.Atoms]
Property that returns all trajectory frames from saved files.
structures : list[list[ase.Atoms]]
Property that returns structures organized by simulation run.
frames : znh5md.IO
Property that returns a lazy IO handle for trajectory frames.
Comment on lines +129 to +130

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

frames attribute docstrings no longer match actual return type.

The docs state frames : znh5md.IO, but implementation returns znh5md.IO | list[ase.Atoms] depending on file count.

Also applies to: 394-395

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ipsuite/dynamics/md_nodes.py` around lines 129 - 130, Update the docstring
for the frames property in md_nodes.py to reflect the actual return type (it
returns either znh5md.IO or list[ase.Atoms] depending on number of files)
instead of the current single-type declaration; locate the frames
property/method and replace "frames : znh5md.IO" with a union/explicit
description like "frames : znh5md.IO | list[ase.Atoms]" (or equivalent phrase)
and apply the same correction to the other frames docstring occurrence in the
file referencing the same symbol so the docs match runtime behavior.


Examples
--------
Expand Down Expand Up @@ -167,26 +166,30 @@ class ASEMD(zntrack.Node):
model_outs: Path = zntrack.outs_path(zntrack.nwd / "model")
laufband_path: Path = zntrack.outs_path(zntrack.nwd / "laufband.sqlite")

@property
def frames(self) -> list[ase.Atoms]:
files = list(self.state.fs.glob((self.frames_path / "*.h5").as_posix()))
frames = []
for file in files:
with self.state.fs.open(file, "rb") as f:
with h5py.File(f) as file:
frames.extend(znh5md.IO(file_handle=file)[:])
return frames
def _make_file_factory(self, file_path: str) -> t.Callable:
@contextlib.contextmanager
def _factory() -> t.Generator[h5py.File, None, None]:
Comment on lines +169 to +171

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's examine the file structure and the specific lines mentioned
head -n 200 ipsuite/dynamics/md_nodes.py | tail -n 40

Repository: zincware/IPSuite

Length of output: 1698


🏁 Script executed:

#!/bin/bash
# Check the imports to verify h5py, contextlib, and typing usage
head -n 30 ipsuite/dynamics/md_nodes.py

Repository: zincware/IPSuite

Length of output: 676


🏁 Script executed:

#!/bin/bash
# Look specifically at lines around 169-180 to see the full context
sed -n '165,185p' ipsuite/dynamics/md_nodes.py

Repository: zincware/IPSuite

Length of output: 945


🏁 Script executed:

#!/bin/bash
# Check for Python version requirements
cat pyproject.toml | grep -A 5 "python"

Repository: zincware/IPSuite

Length of output: 478


🏁 Script executed:

#!/bin/bash
# Check for other ContextManager or Callable patterns in the codebase
rg 'ContextManager|contextmanager' ipsuite/ --type py -B 2 -A 2 | head -50

Repository: zincware/IPSuite

Length of output: 435


🏁 Script executed:

#!/bin/bash
# Check all underspecified Callable annotations
rg '->\s*t\.Callable(?!\[)' ipsuite/ --type py

Repository: zincware/IPSuite

Length of output: 464


🏁 Script executed:

#!/bin/bash
# Properly escape the regex pattern for finding underspecified Callable
rg 'Callable(?!\[)' ipsuite/ --type py | grep -v '\[' | grep 'Callable'

Repository: zincware/IPSuite

Length of output: 285


🏁 Script executed:

#!/bin/bash
# Check how contextlib types are used elsewhere in the project
rg 'AbstractContextManager|ContextManager' ipsuite/ --type py

Repository: zincware/IPSuite

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Check if contextlib is imported elsewhere and how context managers are typed
rg -B 5 'contextlib' ipsuite/dynamics/md_nodes.py | head -30

Repository: zincware/IPSuite

Length of output: 387


🏁 Script executed:

#!/bin/bash
# Verify the typing of the decorated generator function
python3 << 'EOF'
import typing as t
import contextlib
import h5py

# Check what typing constructs are available in 3.11+
print("Available in typing module:")
print(f"  ContextManager: {hasattr(t, 'ContextManager')}")
print(f"  AbstractContextManager: {hasattr(t, 'AbstractContextManager')}")

# Check the correct annotation for a contextmanager-decorated function
from typing import get_type_hints

`@contextlib.contextmanager`
def example() -> t.Generator[str, None, None]:
    yield "test"

# The actual type of the decorated function
print(f"\nDecorated function returns: {type(example)}")
print(f"Callable signature needed: Callable[[], {t.ContextManager if hasattr(t, 'ContextManager') else t.AbstractContextManager}[YieldType]]")
EOF

Repository: zincware/IPSuite

Length of output: 176


🏁 Script executed:

#!/bin/bash
# Check what typing constructs are available in 3.11+
python3 << 'EOF'
import typing as t
import contextlib

# Check what typing constructs are available in 3.11+
print("Available in typing module:")
print(f"  ContextManager: {hasattr(t, 'ContextManager')}")
print(f"  AbstractContextManager: {hasattr(t, 'AbstractContextManager')}")

# The correct annotation for a contextmanager-decorated function
# A `@contextlib.contextmanager` decorated generator yields type T
# and returns a context manager that manages that type
print("\nCorrect annotation options:")
print("  Option 1: t.Callable[[], t.ContextManager[FileType]]  (if available)")
print("  Option 2: t.Callable[[], t.AbstractContextManager[FileType]]  (recommended)")
print("  Option 3: t.Callable[[], ContextManager[FileType]]  (from contextlib)")

# Check which is the standard Python 3.11+ pattern
import sys
print(f"\nPython version: {sys.version}")
EOF

Repository: zincware/IPSuite

Length of output: 459


🏁 Script executed:

#!/bin/bash
# Search for all Callable annotations in the file to find underspecified ones
rg 'Callable' ipsuite/dynamics/md_nodes.py

Repository: zincware/IPSuite

Length of output: 123


🏁 Script executed:

#!/bin/bash
# Check broader patterns for context manager typing in the codebase
rg 'Callable.*\[' ipsuite/ --type py | head -20

Repository: zincware/IPSuite

Length of output: 42


Parameterize _make_file_factory return type annotation.

The return type -> t.Callable lacks type parameters, which violates the requirement to use full type annotations. The decorated _factory function should be annotated to indicate it returns a callable that produces an h5py.File context manager.

Fix
-    def _make_file_factory(self, file_path: str) -> t.Callable:
+    def _make_file_factory(
+        self, file_path: str
+    ) -> t.Callable[[], t.ContextManager[h5py.File]]:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ipsuite/dynamics/md_nodes.py` around lines 169 - 171, The return type of
_make_file_factory is missing parameters; update its signature to return a
callable that produces a context manager yielding an h5py.File (e.g. change def
_make_file_factory(self, file_path: str) -> t.Callable to def
_make_file_factory(self, file_path: str) -> t.Callable[[],
t.ContextManager[h5py.File]]), and ensure the inner _factory still has an
appropriate annotation (e.g. -> t.Generator[h5py.File, None, None] or use
contextlib typing) so tools see the concrete context-manager return type.

if self.state.rev is None and self.state.remote is None:
with h5py.File(file_path, "r") as file:
yield file
else:
with self.state.fs.open(file_path, "rb") as f:
with h5py.File(f) as file:
yield file

return _factory

@property
def structures(self) -> list[list[ase.Atoms]]:
"""Return the structures as a list of lists of Atoms."""
files = list(self.state.fs.glob((self.frames_path / "*.h5").as_posix()))
structures = []
for file in files:
with self.state.fs.open(file, "rb") as f:
with h5py.File(f) as file:
structures.append(znh5md.IO(file_handle=file)[:])
return structures
def frames(self) -> znh5md.IO | list[ase.Atoms]:
files = sorted(self.state.fs.glob((self.frames_path / "*.h5").as_posix()))
if not files:
raise FileNotFoundError(f"No HDF5 files found in {self.frames_path}")
if len(files) == 1:
return znh5md.IO(file_factory=self._make_file_factory(files[0]))
return sum(
[znh5md.IO(file_factory=self._make_file_factory(f))[:] for f in files],
[],
)

def initialize_md(self):
self.model_outs.mkdir(parents=True, exist_ok=True)
Expand Down Expand Up @@ -263,7 +266,7 @@ def run_md(self, idx: int, atoms: ase.Atoms) -> int: # noqa: C901
ncols=120,
)
io = znh5md.IO(
self.frames_path / f"{idx}.h5",
self.frames_path / "md.h5",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -nP 'initialize_md\(|md\.h5|unlink\(|remove\(' ipsuite/dynamics/md_nodes.py -C3

Repository: zincware/IPSuite

Length of output: 1748


🏁 Script executed:

cat -n ipsuite/dynamics/md_nodes.py | sed -n '250,280p'

Repository: zincware/IPSuite

Length of output: 1415


🏁 Script executed:

wc -l ipsuite/dynamics/md_nodes.py

Repository: zincware/IPSuite

Length of output: 92


🏁 Script executed:

rg -A5 'znh5md\.IO\(' ipsuite/dynamics/md_nodes.py

Repository: zincware/IPSuite

Length of output: 587


🏁 Script executed:

fd -e py ipsuite/dynamics/ | xargs rg -l 'md\.h5|znh5md' | head -10

Repository: zincware/IPSuite

Length of output: 759


🏁 Script executed:

cat -n ipsuite/dynamics/md_nodes.py | sed -n '255,275p'

Repository: zincware/IPSuite

Length of output: 1026


🏁 Script executed:

rg 'znh5md\.IO' -B2 -A2 ipsuite/

Repository: zincware/IPSuite

Length of output: 9974


🏁 Script executed:

fd . tests/ --full-path | xargs rg 'znh5md\.IO|md\.h5' -l

Repository: zincware/IPSuite

Length of output: 127


🏁 Script executed:

rg 'import znh5md|from znh5md' ipsuite/dynamics/md_nodes.py -B2 -A2

Repository: zincware/IPSuite

Length of output: 196


🏁 Script executed:

rg 'znh5md' pyproject.toml

Repository: zincware/IPSuite

Length of output: 80


🏁 Script executed:

rg 'create_file|\.extend\(|\.append\(' ipsuite/dynamics/md_nodes.py -B2 -A2

Repository: zincware/IPSuite

Length of output: 729


🏁 Script executed:

cat -n ipsuite/dynamics/md_nodes.py | sed -n '259,310p'

Repository: zincware/IPSuite

Length of output: 2603


🏁 Script executed:

rg 'io\.create_file\(\)' ipsuite/ -B3 -A3

Repository: zincware/IPSuite

Length of output: 443


Delete md.h5 file in initialize_md() to prevent frame accumulation across reruns.

Without deleting the existing md.h5 file, calls to io.extend() will append frames from reruns to the previous trajectory, silently mixing old and new frames. Add file deletion before the simulation loop starts.

Proposed fix
 def initialize_md(self):
     self.model_outs.mkdir(parents=True, exist_ok=True)
     self.frames_path.mkdir(parents=True, exist_ok=True)
+    md_file = self.frames_path / "md.h5"
+    if md_file.exists():
+        md_file.unlink()
     self.laufband_path.parent.mkdir(parents=True, exist_ok=True)
     (self.model_outs / "outs.txt").write_text("Lorem Ipsum")
     self.rng = np.random.default_rng(self.seed)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ipsuite/dynamics/md_nodes.py` at line 260, The initialize_md() routine must
remove any existing trajectory file before the sim run to avoid io.extend()
appending old frames; add logic in initialize_md() to check for and delete
self.frames_path / "md.h5" (or otherwise remove that file) prior to entering the
simulation loop so each run starts with a fresh file. Ensure deletion handles
missing-file quietly and raises/logs only unexpected IO errors.

)
# We do not save the starting configuration. E.g. step 0 is not saved!
with Live(console=progress.console, refresh_per_second=10) as live:
Expand Down Expand Up @@ -388,10 +391,8 @@ class ASEMDSafeSampling(ASEMD):
Output path for model-specific output files.
laufband_path : Path
Path to the job queue database file.
frames : list[ase.Atoms]
Property that returns all trajectory frames from saved files.
structures : list[list[ase.Atoms]]
Property that returns structures organized by simulation run.
frames : znh5md.IO
Property that returns a lazy IO handle for trajectory frames.

Examples
--------
Expand Down
Loading