.. -*- coding: utf-8 -*-
.. _IMD-format:
==================================================================
IMD (Data streamed via Interactive Molecular Dynamics protocol v3)
==================================================================
.. include:: classes/IMD.txt
Real-time streaming of simulation data between molecular dynamics engines and receiving clients can be achieved using IMD protocols like IMDv2 and IMDv3. The :class:`~MDAnalysis.coordinates.IMD.IMDReader` implements the IMDv3 protocol, enabling live streaming of ongoing simulation data.
.. note::
MDAnalysis supports **IMDv3 only**, which provides continuous, gap-free streaming and is implemented in modern versions of GROMACS, LAMMPS, and NAMD. IMDv2, while widely available, was designed primarily for visualization and allows gaps in the data stream.
What is Streaming?
==================
Streaming involves processing data in real-time as it is generated, rather than storing it for later analysis. In molecular dynamics, this means sending simulation data to a client on-the-fly while the simulation is running, without writing large trajectory files to disk.
This can be achieved through a TCP/IP socket connection between the simulation engine and receiving client, transmitting coordinates, velocities, forces, energies, and timing information using the IMDv3 protocol.
MDAnalysis's :class:`~MDAnalysis.coordinates.IMD.IMDReader` uses the `imdclient `_ package and provides a familiar interface for reading streaming data, similar to other trajectory readers in MDAnalysis.
When to Use Streaming?
======================
Streaming analysis is particularly valuable for:
**Long-running simulations**
Early detection of problems (crashes, artifacts, equilibration issues) can save computational resources.
**Adaptive sampling workflows**
Real-time analysis can guide simulation parameters or trigger enhanced sampling methods.
**Interactive research**
Immediate feedback allows researchers to make informed decisions about continuing, modifying, or terminating simulations.
**Storage-constrained environments**
Analyze data as it's generated without storing large trajectory files.
Installation and Setup
======================
Required Dependencies
---------------------
The IMDReader requires the ``imdclient`` package:
.. code-block:: bash
pip install imdclient
.. note::
MDAnalysis requires ``imdclient >= 0.2.2`` for its current implementation.
MD Engine Configuration
-----------------------
We provide below example configurations for enabling IMDv3 streaming in popular MD engines.
**GROMACS**
Add IMD settings to your ``.mdp`` file:
.. code-block:: text
; IMD settings
IMD-group = System
IMD-version = 3
IMD-nst = 1
IMD-time = No
IMD-coords = Yes
IMD-vels = No
IMD-forces = No
IMD-box = No
IMD-unwrap = No
IMD-energies = No
Run with IMD enabled:
.. code-block:: bash
gmx mdrun -v -nt 4 -imdwait -imdport 8889
**LAMMPS**
Use the IMD fix in your input script:
.. code-block:: text
# IMD setup
fix ID group-ID imd trate version 3 unwrap time box coordinates velocities forces
Run your LAMMPS simulation as usual.
**NAMD**
Add IMD configuration to your NAMD configuration file:
.. code-block:: text
# IMD Settings
IMDon yes
IMDport
IMDwait
IMDfreq
IMDsendPositions
IMDsendEnergies
IMDsendTime
IMDsendBoxDimensions
IMDsendVelocities
IMDsendForces
IMDwrapPositions
Run your NAMD simulation as usual.
.. seealso::
For detailed engine-specific setup instructions, see the `imdclient simulation engine documentation `_.
Basic Usage
===========
Connecting to a Running Simulation
-----------------------------------
Once your simulation is running with IMD enabled:
.. code-block:: python
import MDAnalysis as mda
# Connect to the simulation
u = mda.Universe("topol.tpr", "imd://localhost:8889", buffer_size=10*1024*1024)
# Streaming analysis loop
for ts in u.trajectory:
print(f"Time: {ts.time:.2f} ps, Step: {ts.data.get('step', 'N/A')}")
# Your analysis code here
selected_atoms = u.select_atoms("protein and name CA")
center_of_mass = selected_atoms.center_of_mass()
print(f"Protein COM: {center_of_mass}")
# Optional: break on some condition
if ts.time > 1000: # Stop after 1000 ps
break
Real-time Quality Control
-------------------------
Monitor simulation health in real-time:
.. code-block:: python
import MDAnalysis as mda
import numpy as np
u = mda.Universe("system.tpr", "imd://localhost:8889")
previous_positions = None
for ts in u.trajectory:
current_positions = u.atoms.positions.copy()
# Check for simulation artifacts
if previous_positions is not None:
displacement = np.linalg.norm(current_positions - previous_positions, axis=1)
max_displacement = np.max(displacement)
if max_displacement > 10.0: # Atoms moved > 10 Å in one step
print(f"WARNING: Large displacement detected at {ts.time} ps: {max_displacement:.2f} Å")
# Monitor energies if available
if 'potential' in ts.data:
print(f"Potential energy: {ts.data['potential']:.2f}")
previous_positions = current_positions
Advanced Features
=================
Buffer Management
-----------------
For compute-intensive analysis, increase the buffer size to reduce communication overhead:
.. code-block:: python
# Larger buffer for better performance
u = mda.Universe("topol.tpr", "imd://localhost:8889", buffer_size=50*1024*1024)
Connection Management
---------------------
Always ensure proper cleanup, especially in interactive environments like Jupyter notebooks et al.:
.. code-block:: python
import MDAnalysis as mda
try:
u = mda.Universe("topol.tpr", "imd://localhost:8889")
for ts in u.trajectory:
# Your analysis here
pass
except Exception as e:
print(f"Error during streaming: {e}")
finally:
# Always close the connection
u.trajectory.close()
Available Data
--------------
The IMDReader provides access to additional simulation data through ``ts.data``:
* ``dt``: Time step size in picoseconds
* ``step``: Current simulation step number
* Energy terms: ``potential``, ``total``, etc. (engine-dependent)
.. code-block:: python
for ts in u.trajectory:
print(f"Step {ts.data.get('step')}: dt={ts.data.get('dt')} ps")
# Available energy terms vary by MD engine
for key, value in ts.data.items():
if key not in ['dt', 'step']:
print(f" {key}: {value}")
Multiple Client Connections
===========================
The ability to connect multiple clients to the same IMD port depends on the MD engine implementation:
* **GROMACS**: Typically supports single client connections
* **LAMMPS**: May support multiple clients (version-dependent)
* **NAMD**: Supports multiple clients
.. important::
Even when multiple connections are supported, each receives an independent data stream. Different clients may receive different data depending on the engine configuration.
Integration with MDAnalysis Tools
=================================
Most MDAnalysis analysis classes work with streaming data, but some limitations apply:
**Compatible Analysis**
.. code-block:: python
from MDAnalysis.analysis import distances, contacts
u = mda.Universe("system.tpr", "imd://localhost:8889")
for ts in u.trajectory:
# Distance calculations work normally
protein = u.select_atoms("protein")
rg = protein.radius_of_gyration()
# Contact analysis
selection1 = u.select_atoms("resid 1-10")
selection2 = u.select_atoms("resid 50-60")
dist_array = distances.distance_array(selection1.positions, selection2.positions)
**Limitations with Streaming**
Some analysis methods require the complete trajectory and won't work with streaming:
.. code-block:: python
# These will NOT work with streaming:
# - trajectory.timeseries()
# - Most analysis classes that need multiple passes
# - Random frame access (trajectory[10])
# - Backward iteration
Important Limitations
=====================
Streaming analysis has fundamental constraints due to its real-time nature:
**Data Access Limitations**
* **No random access**: Cannot jump to arbitrary frames or seek backwards
* **Forward-only**: Can only iterate through frames as they arrive
* **Single-use**: Cannot restart iteration once the stream is consumed
* **No trajectory length**: Total frame count unknown until simulation ends
* **No independent copies**: Cannot create multiple reader instances for the same stream
**Analysis Constraints**
* **No timeseries methods**: Cannot use ``trajectory.timeseries()``
* **No bulk operations**: Cannot extract all data at once
* **Limited multiprocessing**: Cannot split across processes
* **Single client**: Only one reader per IMD stream (engine-dependent)
**Practical Considerations**
.. code-block:: python
# This WILL work - forward iteration
for ts in u.trajectory:
analysis_data.append(calculate_something(ts))
# This will NOT work - random access
ts = u.trajectory[10] # TypeError
# This will NOT work - backwards iteration
for ts in u.trajectory[::-1]: # ValueError
pass
# This will NOT work - restarting iteration
for ts in u.trajectory:
break
for ts in u.trajectory: # Won't start from beginning
pass
See Also
========
* :class:`~MDAnalysis.coordinates.IMD.IMDReader` - Technical API documentation
* :class:`~MDAnalysis.coordinates.base.StreamReaderBase` - Base class for streaming readers
* `imdclient documentation `_ - Complete imdclient package documentation
* `IMDv3 protocol specification `_ - Technical protocol details