Utilities Module ================ The Utilities module provides a set of helpful tools for working with machine learning potentials and molecular dynamics trajectories. It includes functions for evaluating model accuracy, preparing datasets, manipulating trajectories, and citing relevant software packages. Capabilities ------------ The Utilities module includes the following tools: Model Error Evaluation (EVAL_ERROR) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ * Calculates errors between machine learning potential predictions and reference data * Allows also .xyz format for forces and positions * Computes root mean square error (RMSE) for both forces and energies * Outputs error statistics to a text file Reference Data Preparation (PREPARE_EVAL_ERROR) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ * Extracts frames from a MLIP trajectory file at specified intervals * Creates input files for CP2K reference calculations * Prepares necessary files for model evaluation * Generates runscripts for obtaining reference data Trajectory Frame Extraction (EXTRACT_XYZ) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ * Extracts frames from XYZ trajectory files at specified intervals * Creates a new XYZ file with the extracted frames * Preserves all metadata from the original trajectory Citation Generation (CITATIONS) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ * Provides proper citations for the used frameworks and MLIPs * Extracts model information from log files * Generates BibTeX-formatted citations for publications * Ensures proper attribution of methods used in calculations Benchmarking (BENCHMARK) ~~~~~~~~~~~~~~~~~~~~~~~~ * Creates a structured directory for benchmarking multiple ML potentials * Supports MACE, MatterSim, SevenNet and ORB potentials * Enables direct comparison of performance across models * Provides two operating modes: - MD: Forward simulation with identical starting conditions - RECALC: Recalculation of forces/energies from a reference trajectory * Supports flexible model selection and configuration for each potential * Automatically organizes results and logs for reproducibility * Automatically generates comparison statistics when using RECALC mode Logger Access ~~~~~~~~~~~~~ * Provides access to the run logger and model logger * Displays summaries of previous calculations * Lists available fine-tuned models * Helps track project history and available resources * Facilitates export of run logs to PDF format for documentation (only the last 50 runs) Usage examples: .. code-block:: bash amaceing_utils -l=model # Shows fine-tuned models amaceing_utils -l=run # Shows run history amaceing_utils -l=runexport # Export the run logger to a pdf Usage ----- Command-line Usage ~~~~~~~~~~~~~~~~~~ **Interactive Q&A session:** .. code-block:: bash amaceing_utils This guides you through: 1. Selecting the utility function to use 2. Providing necessary inputs for the selected function 3. Configuring parameters specific to that function **Direct Command Line Usage:** .. code-block:: bash amaceing_utils -rt="FUNCTION_NAME" -c="{'parameter1': 'value1', 'parameter2': 'value2', ...}" Where FUNCTION_NAME is one of: EVAL_ERROR, PREPARE_EVAL_ERROR, EXTRACT_XYZ, CITATIONS, BENCHMARK For model error evaluation: .. code-block:: bash amaceing_utils -rt="EVAL_ERROR" -c="{'ener_filename_ground_truth': 'eval_run-pos-1.xyz', 'force_filename_ground_truth': 'force.xyz', 'ener_filename_compare': 'mace_coord.xyz', 'force_filename_compare': 'mace_force.xyz'}" For trajectory frame extraction: .. code-block:: bash amaceing_utils -rt="EXTRACT_XYZ" -c="{'coord_file': 'trajectory.xyz', 'each_nth_frame': '10'}" For benchmarking: .. code-block:: bash amaceing_utils -rt="BENCHMARK" -c="{'mode': 'MD', 'coord_file': 'coord.xyz', 'pbc_list': '[10 0 0 0 10 0 0 0 10]', 'force_nsteps': '20000', 'mace_model': '['mace_mp' 'small']', 'mattersim_model': 'small', 'sevennet_model': '['7net-mf-ompa' 'mpa']', 'orb_model': '['orb_v3_conservative_inf' 'omat']', 'grace_model': 'GRACE-1L-OMAT'}" To view logger information: .. code-block:: bash amaceing_utils -l=model # Shows fine-tuned models amaceing_utils -l=run # Shows run history amaceing_utils -l=runexport # Export the run logger to a pdf Python API ~~~~~~~~~~ .. code-block:: python from amaceing_toolkit.workflow import utils_api config = { 'ener_filename_ground_truth': 'position_energy_cp2k.xyz', 'force_filename_ground_truth': 'force_cp2k.xyz', 'ener_filename_compare': 'mlip_position_energy.xyz', 'force_filename_compare': 'mlip_force.xyz' } utils_api(run_type='EVAL_ERROR', config=config) Output and File Structure ------------------------- Each utility function produces different outputs: * **EVAL_ERROR**: Creates ``errors.txt`` with statistics on force and energy errors * **PREPARE_EVAL_ERROR**: Creates ``mace_coord.xyz``, ``mace_force.xyz``, and ``pbc`` files * **CITATIONS**: Prints the BibTeX citations for the used frameworks and models * **EXTRACT_XYZ**: Creates a new XYZ file with extracted frames * **BENCHMARK**: Creates directories ``mace/``, ``mattersim/``, ``sevennet/`` and ``orb/`` with input files Technical Details ----------------- * EVAL_ERROR assumes that the ground thruth data has to be converted (Force units: converted from Hartree/Bohr to eV/Å; Energy units: converted from Hartree to eV) * Frame extraction uses consistent time intervals based on frame numbers * Error statistics include both absolute and relative errors * Benchmarking supports both forward simulation and reference trajectory recalculation