pyiron.base.generic.hdfio module

class pyiron.base.generic.hdfio.FileHDFio(file_name, h5_path='/', mode='a')[source]

Bases: object

Class that provides all the information needed to access an HDF5 file. This class is based on h5io.py, which allows getting and putting a large variety of jobs to/from HDF5

Parameters
  • file_name (str) – absolute path of the HDF5 file

  • h5_path (str) – absolute path inside the HDF5 file - starting from the root group

  • mode (str) – mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’ See HDFStore docstring or tables.open_file for info about modes

file_name
absolute path to the HDF5 file
h5_path
path inside the HDF5 file - also stored as absolute path
history
previously opened groups / folders
file_exists
boolean if the HDF5 file was already written
base_name
name of the HDF5 file but without any file extension
file_path
directory where the HDF5 file is located
is_root
boolean if the HDF5 object is located at the root level of the HDF5 file
is_open
boolean if the HDF5 file is currently opened - if an active file handler exists
is_empty
boolean if the HDF5 file is empty
property base_name

Name of the HDF5 file - but without the file extension .h5

Returns

file name without the file extension

Return type

str

close()[source]

Close the current HDF5 path and return to the path before the last open

copy()[source]

Copy the Python object which links to the HDF5 file - in contrast to copy_to() which copies the content of the HDF5 file to a new location.

Returns

New FileHDFio object pointing to the same HDF5 file

Return type

FileHDFio

copy_to(destination, file_name=None, maintain_name=True)[source]

Copy the content of the HDF5 file to a new location

Parameters
  • destination (FileHDFio) – FileHDFio object pointing to the new location

  • file_name (str) – name of the new HDF5 file - optional

  • maintain_name (bool) – by default the names of the HDF5 groups are maintained

Returns

FileHDFio object pointing to a file which now contains the same content as the file of the current FileHDFio object.

Return type

FileHDFio

create_group(name)[source]

Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.

Parameters

name (str) – name of the HDF5 group

Returns

FileHDFio object pointing to the new group

Return type

FileHDFio

property file_exists

Check if the HDF5 file exists already

Returns

[True/False]

Return type

bool

property file_name

Get the file name of the HDF5 file

Returns

absolute path to the HDF5 file

Return type

str

property file_path

Path where the HDF5 file is located - posixpath.dirname()

Returns

HDF5 file location

Return type

str
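
Both base_name and file_path are plain path manipulations on file_name; a minimal stdlib sketch of the derivation (the concrete path is made up for illustration):

```python
import posixpath

# Hypothetical absolute path of an HDF5 file, as held in file_name.
file_name = "/home/pyiron/projects/demo/job.h5"

# base_name: name of the HDF5 file without the .h5 extension.
base_name = posixpath.splitext(posixpath.basename(file_name))[0]

# file_path: directory where the HDF5 file is located (posixpath.dirname).
file_path = posixpath.dirname(file_name)

print(base_name)  # job
print(file_path)  # /home/pyiron/projects/demo
```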

static file_size(hdf)[source]

Get size of the HDF5 file

Parameters

hdf (FileHDFio) – hdf file

Returns

file size in Bytes

Return type

float
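
file_size() reports the size in bytes; the underlying operation is equivalent to os.path.getsize applied to file_name. A minimal stdlib sketch, where a temporary file stands in for a real HDF5 file:

```python
import os
import tempfile

# Create a stand-in file with a known number of bytes.
with tempfile.NamedTemporaryFile(delete=False, suffix=".h5") as f:
    f.write(b"x" * 1024)
    tmp_name = f.name

# Equivalent of FileHDFio.file_size(): size of the file in bytes.
size_in_bytes = float(os.path.getsize(tmp_name))
print(size_in_bytes)  # 1024.0

os.remove(tmp_name)
```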

get(key)[source]

Internal wrapper function for __getitem__() - self[key]

Parameters

key (str, slice) – path to the data or key of the data object

Returns

data or data object

Return type

dict, list, float, int

get_from_table(path, name)[source]

Get a specific value from a pandas.DataFrame

Parameters
  • path (str) – relative path to the data object

  • name (str) – parameter key

Returns

the value associated with the specified parameter key

Return type

dict, list, float, int

get_pandas(name)[source]

Load a dictionary from the HDF5 file and display the dictionary as a pandas.DataFrame

Parameters

name (str) – HDF5 node name

Returns

The dictionary is returned as a pandas.DataFrame object

Return type

pandas.DataFrame

get_size(hdf)[source]

Get size of the groups inside the HDF5 file

Parameters

hdf (FileHDFio) – hdf file

Returns

file size in Bytes

Return type

float

groups()[source]

Filter HDF5 file by groups

Returns

an HDF5 file which is filtered by groups

Return type

FileHDFio

property h5_path

Get the path in the HDF5 file starting from the root group - meaning this path starts with ‘/’

Returns

HDF5 path

Return type

str
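
h5_path is always stored as an absolute path inside the file, so a relative group name (as later passed to open()) has to be resolved against the current position. A stdlib sketch of that resolution; the group names are hypothetical:

```python
import posixpath

current_h5_path = "/outer"       # hypothetical current position in the file
h5_rel_path = "inner/subgroup"   # relative path, as passed to open()

# Resolve the relative path against the current position; normpath also
# collapses constructs such as "../" so the result stays a clean absolute path.
new_h5_path = posixpath.normpath(posixpath.join(current_h5_path, h5_rel_path))
print(new_h5_path)  # /outer/inner/subgroup
```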

hd_copy(hdf_old, hdf_new, exclude_groups=None)[source]

Copy the content of one HDF5 object to another, optionally excluding selected groups.

Parameters
  • hdf_old (ProjectHDFio) – old hdf

  • hdf_new (ProjectHDFio) – new hdf

  • exclude_groups (list/None) – list of groups to delete

property is_empty

Check if the HDF5 file is empty

Returns

[True/False]

Return type

bool

property is_root

Check if the current h5_path is pointing to the HDF5 root group.

Returns

[True/False]

Return type

bool

items()[source]

List all keys and values as items of all groups and nodes of the HDF5 file

Returns

list of (key, value) tuples

Return type

list

keys()[source]

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

Returns

all groups and nodes

Return type

list

list_all()[source]

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

Returns

{‘groups’: [list of groups], ‘nodes’: [list of nodes]}

Return type

dict
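
The split into groups and nodes can be pictured with a nested dictionary, where sub-dictionaries play the role of groups and leaf values the role of nodes. A sketch of the {'groups': [...], 'nodes': [...]} structure returned by list_all(); the content is made up for illustration:

```python
# Stand-in for one level of an HDF5 file: dict values that are themselves
# dicts correspond to groups, everything else corresponds to nodes.
level = {
    "input": {"incar": "..."},   # group
    "output": {"energy": -1.0},  # group
    "status": "finished",        # node
    "version": 1,                # node
}

listing = {
    "groups": sorted(k for k, v in level.items() if isinstance(v, dict)),
    "nodes": sorted(k for k, v in level.items() if not isinstance(v, dict)),
}
print(listing)
# {'groups': ['input', 'output'], 'nodes': ['status', 'version']}
```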

list_dirs()[source]

equivalent to os.listdir (consider groups as equivalent to dirs)

Returns

list of groups in pytables for the path self.h5_path

Return type

(list)

list_groups()[source]

equivalent to os.listdir (consider groups as equivalent to dirs)

Returns

list of groups in pytables for the path self.h5_path

Return type

(list)

list_nodes()[source]

List all nodes of the HDF5 file

Returns

list of nodes

Return type

list

listdirs()[source]

equivalent to os.listdir (consider groups as equivalent to dirs)

Returns

list of groups in pytables for the path self.h5_path

Return type

(list)

nodes()[source]

Filter HDF5 file by nodes

Returns

an HDF5 file which is filtered by nodes

Return type

FileHDFio

open(h5_rel_path)[source]

Create an HDF5 group and enter this specific group. If the group already exists in the HDF5 path, only the h5_path is updated accordingly; otherwise the group is created first.

Parameters

h5_rel_path (str) – relative path from the current HDF5 path - h5_path - to the new group

Returns

FileHDFio object pointing to the new group

Return type

FileHDFio
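
open() and close() behave like a stack of positions inside the file: open() remembers the current h5_path in history and descends into the group, and close() returns to the position before the last open(). A small stand-alone sketch of that bookkeeping - a toy model, not the actual implementation:

```python
import posixpath

class PathStack:
    """Toy model of the open()/close() history bookkeeping."""

    def __init__(self):
        self.h5_path = "/"
        self.history = []

    def open(self, h5_rel_path):
        # Remember the current position, then descend into the group.
        self.history.append(self.h5_path)
        self.h5_path = posixpath.normpath(
            posixpath.join(self.h5_path, h5_rel_path)
        )
        return self

    def close(self):
        # Return to the path before the last open().
        self.h5_path = self.history.pop()

stack = PathStack()
stack.open("outer").open("inner")
print(stack.h5_path)  # /outer/inner
stack.close()
print(stack.h5_path)  # /outer
```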

put(key, value)[source]

Store data inside the HDF5 file

Parameters
  • key (str) – key to store the data

  • value (pandas.DataFrame, pandas.Series, dict, list, float, int) – basically any kind of data is supported

remove_file()[source]

Remove the HDF5 file with all the related content

remove_group()[source]

Remove an HDF5 group if it exists. If the group does not exist, no error is raised.

rewrite_hdf5(job_name, info=False, exclude_groups=None)[source]

Rewrite the HDF5 file of a job to reclaim the space of deleted content.

Parameters
  • job_name (str) – name of the job

  • info (bool) – whether to print how much space has been saved

  • exclude_groups (list/None) – list of groups to delete from the HDF5 file

show_hdf()[source]

Iterate over the HDF5 data structure and generate a human-readable graph.

values()[source]

List all values for all groups and nodes of the HDF5 file

Returns

list of all values

Return type

list

class pyiron.base.generic.hdfio.HDFStoreIO(path, mode=None, complevel=None, complib=None, fletcher32=False, **kwargs)[source]

Bases: pandas.io.pytables.HDFStore

dict-like IO interface for storing pandas objects in PyTables, in either fixed or table format - copied from pandas.HDFStore

Parameters
  • path (str) – File path to HDF5 file

  • mode (str) – {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’

    'r'

    Read-only; no data can be modified.

    'w'

    Write; a new file is created (an existing file with the same name would be deleted).

    'a'

    Append; an existing file is opened for reading and writing, and if the file does not exist it is created.

    'r+'

    Similar to 'a', but the file must already exist.

  • complevel (int) – 1-9, default 0. If a complib is specified, compression will be applied where possible.

  • complib (str) – {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’, None}, default None. If complevel is > 0, compression is applied to objects written in the store wherever possible.

  • fletcher32 (bool) – default False. If applying compression, use the fletcher32 checksum.

open(**kwargs)[source]

Open the file in the specified mode - copied from pandas.HDFStore.open()

Parameters

**kwargs – mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’ See HDFStore docstring or tables.open_file for info about modes

Returns

self - in contrast to the original implementation in pandas.

Return type

HDFStoreIO

class pyiron.base.generic.hdfio.ProjectHDFio(project, file_name, h5_path=None, mode=None)[source]

Bases: pyiron.base.generic.hdfio.FileHDFio

The ProjectHDFio class connects the FileHDFio class and the Project class. It is derived from FileHDFio, but in addition a Project object instance is available at self.project, enabling direct access to the database and other project-related functionality, some of which is mapped to the ProjectHDFio class as well.

Parameters
  • project (Project) – pyiron Project the current HDF5 project is located in

  • file_name (str) – name of the HDF5 file - in contrast to the FileHDFio object where file_name represents the absolute path of the HDF5 file.

  • h5_path (str) – absolute path inside the h5 path - starting from the root group

  • mode (str) – mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’ See HDFStore docstring or tables.open_file for info about modes

.. attribute:: project

Project instance the ProjectHDFio object is located in

.. attribute:: root_path

the pyiron user directory, defined in the .pyiron configuration

.. attribute:: project_path

the relative path of the current project / folder starting from the root path of the pyiron user directory

.. attribute:: path

the absolute path of the current project / folder plus the absolute path in the HDF5 file as one path

.. attribute:: file_name

absolute path to the HDF5 file

.. attribute:: h5_path

path inside the HDF5 file - also stored as absolute path

.. attribute:: history

previously opened groups / folders

.. attribute:: file_exists

boolean if the HDF5 file was already written

.. attribute:: base_name

name of the HDF5 file but without any file extension

.. attribute:: file_path

directory where the HDF5 file is located

.. attribute:: is_root

boolean if the HDF5 object is located at the root level of the HDF5 file

.. attribute:: is_open

boolean if the HDF5 file is currently opened - if an active file handler exists

.. attribute:: is_empty

boolean if the HDF5 file is empty

.. attribute:: user

current unix/linux/windows user who is running pyiron

.. attribute:: sql_query

an SQL query to limit the jobs within the project to a subset which matches the SQL query.

.. attribute:: db

connection to the SQL database

.. attribute:: working_directory

working directory the job is executed in - outside the HDF5 file

property base_name

The absolute path of the current pyiron project - the absolute path on the file system, not including the path inside the HDF5 file.

Returns

current project path

Return type

str

copy()[source]

Copy the ProjectHDFio object - copying just the Python object but maintaining the same pyiron path

Returns

copy of the ProjectHDFio object

Return type

ProjectHDFio

create_hdf(path, job_name)[source]

Create a ProjectHDFio object to store project-related information - for testing aggregated data

Parameters
  • path (str) – absolute path

  • job_name (str) – name of the HDF5 container

Returns

HDF5 object

Return type

ProjectHDFio

create_object(class_name, **qwargs)[source]

Internal function to create a pyiron object

Parameters
  • class_name (str) – name of a pyiron class

  • **qwargs – object parameters

Returns

defined by the pyiron class in class_name with the input from **qwargs

Return type

pyiron object

create_working_directory()[source]

Create the working directory on the file system if it does not exist already.

property db

Get connection to the SQL database

Returns

database connection

Return type

DatabaseAccess

get_job_id(job_specifier)[source]

Get the job_id from the database for the job named job_name in the local project path

Parameters

job_specifier (str, int) – name of the job or job ID

Returns

job ID of the job

Return type

int

inspect(job_specifier)[source]

Inspect an existing pyiron object - most commonly a job - from the database

Parameters

job_specifier (str, int) – name of the job or job ID

Returns

Access to the HDF5 object - not a GenericJob object - use load() instead.

Return type

JobCore

load(job_specifier, convert_to_object=True)[source]

Load an existing pyiron object - most commonly a job - from the database

Parameters
  • job_specifier (str, int) – name of the job or job ID

  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.

Returns

Either the full GenericJob object or just a reduced JobCore object

Return type

GenericJob, JobCore

load_from_jobpath(job_id=None, db_entry=None, convert_to_object=True)[source]

Internal function to load an existing job either based on the job ID or based on the database entry dictionary.

Parameters
  • job_id (int) – Job ID - optional, but either the job_id or the db_entry is required.

  • db_entry (dict) – database entry dictionary - optional, but either the job_id or the db_entry is required.

  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.

Returns

Either the full GenericJob object or just a reduced JobCore object

Return type

GenericJob, JobCore

property path

Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.

Returns

absolute path

Return type

str

property project

Get the project instance the ProjectHDFio object is located in

Returns

pyiron project

Return type

Project

property project_path

the relative path of the current project / folder starting from the root path of the pyiron user directory

Returns

relative path of the current project / folder

Return type

str

remove_job(job_specifier, _unprotect=False)[source]

Remove a single job from the project based on its job_specifier - see also remove_jobs()

Parameters
  • job_specifier (str, int) – name of the job or job ID

  • _unprotect (bool) – [True/False] delete the job without validating the dependencies to other jobs - default=False

property root_path

the pyiron user directory, defined in the .pyiron configuration

Returns

pyiron user directory of the current project

Return type

str

property sql_query

Get the SQL query for the project

Returns

SQL query

Return type

str

to_object(object_type=None, **qwargs)[source]

Load the full pyiron object from an HDF5 file

Parameters
  • object_type – if the ‘TYPE’ node is not available in the HDF5 file a manual object type can be set - optional

  • **qwargs – optional parameters [‘job_name’, ‘project’] - to specify the location of the HDF5 path

Returns

pyiron object

Return type

GenericJob

property user

Get current unix/linux/windows user who is running pyiron

Returns

username

Return type

str

property working_directory

Get the working directory of the current ProjectHDFio object. The working directory equals the path, but it is represented on the filesystem:

/absolute/path/to/the/file.h5/path/inside/the/hdf5/file

becomes:

/absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file

Returns

absolute path to the working directory

Return type

str
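
The translation shown above is a pure string operation on the file part of the path: the .h5 extension is replaced by _hdf5. A sketch of that mapping; the concrete path is made up for illustration:

```python
# Hypothetical combined path: HDF5 file plus path inside the file.
path = "/absolute/path/to/the/file.h5/path/inside/the/hdf5/file"

# Split at the file extension and swap ".h5" for "_hdf5".
file_part, _, inner_part = path.partition(".h5/")
working_directory = file_part + "_hdf5/" + inner_part
print(working_directory)
# /absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file
```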