Exercise 2: Creating and working with structure databases
Before the exercise, you should:
Finish exercise 1
The aim of this exercise is to make you familiar with:
Creating structure databases and working with them for potential fitting (day 2)
Importing necessary modules and creating a project
This is done the same way as shown in the first exercise
import numpy as np
%matplotlib inline
import matplotlib.pylab as plt
import os
from pyiron import Project
pr = Project("creating_datasets")
Creating a structure “container” from the data
We now go over the jobs generated in the first notebook to store structures, energies, and forces into a structure container which will later be used for potential fitting
Note: Usually these datasets are created using highly accurate DFT calculations. For practical reasons, we only demonstrate how to do this using data from LAMMPS calculations (the workflow remains the same).
Access the project created in exercise 1. The ".." means go up one folder in the directory tree, as usual in Linux.
pr_fs = pr["../first_steps"]
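As a rough illustration of how this relative path resolves, the sketch below normalizes it with the Python standard library. The starting path is taken from the project column that pyiron reports for this notebook in the job table; pyiron's own path handling is not reproduced here and may differ in detail.

```python
import posixpath

# Location of the current project, as listed in the job table of this notebook
current_project = "potentials/introduction/creating_datasets"

# ".." steps up one folder before descending into first_steps
resolved = posixpath.normpath(posixpath.join(current_project, "../first_steps"))
print(resolved)  # potentials/introduction/first_steps
```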
Create a TrainingContainer job (to store structures and databases).
container = pr.create.job.TrainingContainer('dataset_example')
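Conceptually, each entry of such a container bundles one structure with its energy, forces, and stress. The sketch below is a plain-Python stand-in for one entry, with field names modelled on the columns of the table that the container returns. The class TrainingEntry is hypothetical and is not part of pyiron; the numbers are rounded placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = Tuple[float, float, float]


@dataclass
class TrainingEntry:  # hypothetical stand-in, not a pyiron class
    name: str
    positions: List[Vector]  # one position per atom
    energy: float            # total energy of the structure
    forces: List[Vector]     # one force vector per atom
    stress: List[float]      # six stress components

    def __post_init__(self):
        if len(self.forces) != len(self.positions):
            raise ValueError("need exactly one force vector per atom")
        if len(self.stress) != 6:
            raise ValueError("stress is expected as six components")

    @property
    def number_of_atoms(self) -> int:
        return len(self.positions)


# A one-atom entry with rounded placeholder values
entry = TrainingEntry(
    name="job_a_4_0",
    positions=[(0.0, 0.0, 0.0)],
    energy=-3.367,
    forces=[(0.0, 0.0, 0.0)],
    stress=[2.18, 2.18, 2.18, 0.0, 0.0, 0.0],
)
print(entry.number_of_atoms)  # 1
```

Checking the shapes when an entry is created is a design choice of this sketch; it catches mismatched force arrays early, before they reach a fitting code.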
Add structures from the E-V curves
For starters, we append structures from the energy-volume curves we calculated earlier
for job in pr_fs["E_V_curve"].iter_jobs(status="finished"):
    container.include_job(job)
We can obtain this data as a pandas table
container.to_pandas()
| | name | atoms | energy | forces | stress | number_of_atoms |
---|---|---|---|---|---|---|
0 | job_a_3_8 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.192897 | [[1.6276043110675176e-16, 1.0529105848988851e-16, 5.1718187378489473e-17]] | [25.037460606087844, 25.03746060546885, 25.03746060312137, 1.2058153515681625e-10, -5.4886913858354095e-11, 5.489273462444544e-11] | 1 |
1 | job_a_3_9 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.319542 | [[7.639186604470375e-18, 1.2897999183801789e-17, 6.560662375038692e-17]] | [11.783580963401858, 11.783580963641624, 11.783580962525912, -9.081682946998626e-10, -5.281239282339811e-10, 5.281079211272299e-10] | 1 |
2 | job_a_4_0 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.367063 | [[-3.5024524628727396e-17, -1.320930466294525e-17, 5.849496262865057e-18]] | [2.177486595771194, 2.1774865945028847, 2.1774865945028834, -1.07506321000983e-09, 1.2040691217407586e-09, 7.657961759832688e-10] | 1 |
3 | job_a_4_1 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.360600 | [[-2.237762269513316e-17, -4.0689075283847526e-17, 2.1062919550300275e-17]] | [-3.3265634524504444, -3.3265634530820014, -3.3265634530820085, -6.528356607304887e-10, 1.6521880752407014e-12, 6.566095180460252e-10] | 1 |
4 | job_a_4_2 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.317017 | [[2.140556230444804e-17, 9.465137265930533e-17, -1.6146749725116617e-17]] | [-7.344005402848352, -7.344005402593722, -7.344005404806233, -4.6368149924092e-10, -7.669372280361131e-10, 7.669350452488288e-10] | 1 |
5 | job_a_4_3 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.241535 | [[-5.0018187940959333e-17, -7.753256254350387e-17, -7.947668332487412e-17]] | [-10.206225126673713, -10.206225127480902, -10.2062251274809, -6.120026228018106e-11, 5.826092092320323e-10, 7.850612746551634e-11] | 1 |
6 | job_a_4_4 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.145751 | [[-7.31320096256601e-17, 2.773206044106321e-16, -1.2031135854225408e-16]] | [-11.04382992252993, -11.043829922467113, -11.043829922415048, -1.632215571589768e-11, 6.058689905330539e-12, -6.060145096853376e-12] | 1 |
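A quick sanity check on these rows is to locate the lowest energy and the point where the stress changes sign, since both should indicate roughly the same equilibrium lattice constant. The sketch below copies the energies and the first stress component (rounded) from the table above. It assumes that a job name such as job_a_3_8 encodes a lattice constant of 3.8, which is a guess from the naming pattern rather than something the table states.

```python
# (name, energy, first stress component), copied and rounded from the table
rows = [
    ("job_a_3_8", -3.192897, 25.037),
    ("job_a_3_9", -3.319542, 11.784),
    ("job_a_4_0", -3.367063, 2.177),
    ("job_a_4_1", -3.360600, -3.327),
    ("job_a_4_2", -3.317017, -7.344),
    ("job_a_4_3", -3.241535, -10.206),
    ("job_a_4_4", -3.145751, -11.044),
]


def lattice_constant(name):
    """Assumed naming scheme: 'job_a_3_8' -> 3.8."""
    _, _, whole, frac = name.split("_")
    return float(f"{whole}.{frac}")


# Structure with the lowest energy on this grid
best = min(rows, key=lambda row: row[1])

# Neighbouring pair between which the stress changes sign
bracket = None
for (name_lo, _, s_lo), (name_hi, _, s_hi) in zip(rows, rows[1:]):
    if s_lo * s_hi < 0:
        bracket = (name_lo, name_hi)

print(best[0], lattice_constant(best[0]))  # job_a_4_0 4.0
print(bracket)  # ('job_a_4_0', 'job_a_4_1')
```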
Add structures from the MD
We also add some structures obtained from the MD simulations
Reloading the MD job. Indexing a project loads jobs within.
job_md = pr_fs["lammps_job"]
We can now iterate over the structures within and add each of them to the container.
traj_length = job_md.number_of_structures
stride = 10
By default, include_job fetches the last computation step of the given job; for any other step, you have to pass the step you want explicitly.
for i in range(0, traj_length, stride):
    container.include_job(job_md, iteration_step=i)
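To see which snapshots such a strided loop picks, the sketch below lists the selected step indices. The trajectory length of 101 is an assumption made for illustration; the real value comes from job_md.number_of_structures. With that assumption the loop selects 11 snapshots, which matches the 11 rows named lammps_job in the reloaded dataset shown later.

```python
traj_length = 101  # assumed here; in the notebook this is job_md.number_of_structures
stride = 10

# Every stride-th snapshot, starting from the first one
selected_steps = list(range(0, traj_length, stride))

print(selected_steps)       # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(len(selected_steps))  # 11
```

A larger stride gives fewer but less correlated snapshots, while a smaller one gives more data that is more strongly correlated along the trajectory.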
Add some defect structures (vacancies, surfaces, etc.)
It is also necessary to include some defect structures and surfaces in the training dataset.
Set up an MD calculation for a structure with a vacancy.
job_lammps = pr.create.job.Lammps("lammps_job_vac")
job_lammps.structure = pr.create.structure.bulk("Al", cubic=True, a=3.61).repeat([3, 3, 3])
Remove the first atom of the structure to create the vacancy.
del job_lammps.structure[0]
job_lammps.potential = "2005--Mendelev-M-I--Al-Fe--LAMMPS--ipr1"
job_lammps.calc_md(temperature=800, pressure=0, n_ionic_steps=10000)
job_lammps.run()
The job lammps_job_vac was saved and received the ID: 94
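The atom count of the vacancy cell can be cross-checked without pyiron. The sketch below builds the positions of a 3x3x3 supercell of the four-atom cubic fcc cell by hand, using the lattice constant from the call above, and removes the first atom. It only illustrates the counting; it is not how pyiron constructs the structure.

```python
from itertools import product

a = 3.61  # lattice constant from the bulk() call above

# Fractional coordinates of the four atoms in the cubic fcc cell
basis = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]

# Repeat the cell three times along each axis
positions = [
    ((i + bx) * a, (j + by) * a, (k + bz) * a)
    for i, j, k in product(range(3), repeat=3)
    for bx, by, bz in basis
]
n_bulk = len(positions)

# Deleting one atom leaves a single vacancy
del positions[0]
n_vacancy = len(positions)

print(n_bulk, n_vacancy)  # 108 107
```

The result agrees with the chemical formula Al107 that the job table reports for lammps_job_vac.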
Set up an MD calculation for a surface structure.
job_lammps = pr.create.job.Lammps("lammps_job_surf")
job_lammps.structure = pr.create.structure.surface("Al", surface_type="fcc111", size=(4, 4, 8), vacuum=12, orthogonal=True)
job_lammps.potential = "2005--Mendelev-M-I--Al-Fe--LAMMPS--ipr1"
job_lammps.calc_md(temperature=800, pressure=0, n_ionic_steps=10000)
job_lammps.run()
The job lammps_job_surf was saved and received the ID: 95
pr
{'groups': [], 'nodes': ['lammps_job_vac', 'lammps_job_surf']}
We now add these structures to the dataset like we did before.
for job_md in pr.iter_jobs(status="finished", hamilton="Lammps"):
    stride = 10
    # Sample every stride-th snapshot of this job's own trajectory
    for i in range(0, job_md.number_of_structures, stride):
        container.include_job(job_md, iteration_step=i)
We run the job to store this dataset in the pyiron database. Without running the training container “job”, the data will not be saved!
container.run()
The job dataset_example was saved and received the ID: 96
pr.job_table()
| | id | status | chemicalformula | job | subjob | projectpath | project | timestart | timestop | totalcputime | computer | hamilton | hamversion | parentid | masterid |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 94 | finished | Al107 | lammps_job_vac | /lammps_job_vac | /home/jovyan/ | potentials/introduction/creating_datasets/ | 2022-06-07 16:37:46.720756 | 2022-06-07 16:37:49.823856 | 3.0 | pyiron@jupyter-m-2epoul#1 | Lammps | 0.1 | None | None |
1 | 95 | finished | Al128 | lammps_job_surf | /lammps_job_surf | /home/jovyan/ | potentials/introduction/creating_datasets/ | 2022-06-07 16:37:51.214853 | 2022-06-07 16:37:53.832937 | 2.0 | pyiron@jupyter-m-2epoul#1 | Lammps | 0.1 | None | None |
2 | 96 | finished | None | dataset_example | /dataset_example | /home/jovyan/ | potentials/introduction/creating_datasets/ | 2022-06-07 16:37:56.081557 | NaT | NaN | pyiron@jupyter-m-2epoul#1 | TrainingContainer | 0.4 | None | None |
Reloading the dataset
This dataset can now be reloaded anywhere to use in the potential fitting procedures
dataset = pr["dataset_example"]
dataset.to_pandas()
| | name | atoms | energy | forces | stress | number_of_atoms |
---|---|---|---|---|---|---|
0 | job_a_3_8 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.192897 | [[1.6276043110675176e-16, 1.0529105848988851e-16, 5.1718187378489473e-17]] | [25.037460606087844, 25.03746060546885, 25.03746060312137, 1.2058153515681625e-10, -5.4886913858354095e-11, 5.489273462444544e-11] | 1 |
1 | job_a_3_9 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.319542 | [[7.639186604470375e-18, 1.2897999183801789e-17, 6.560662375038692e-17]] | [11.783580963401858, 11.783580963641624, 11.783580962525912, -9.081682946998626e-10, -5.281239282339811e-10, 5.281079211272299e-10] | 1 |
2 | job_a_4_0 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.367063 | [[-3.5024524628727396e-17, -1.320930466294525e-17, 5.849496262865057e-18]] | [2.177486595771194, 2.1774865945028847, 2.1774865945028834, -1.07506321000983e-09, 1.2040691217407586e-09, 7.657961759832688e-10] | 1 |
3 | job_a_4_1 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.360600 | [[-2.237762269513316e-17, -4.0689075283847526e-17, 2.1062919550300275e-17]] | [-3.3265634524504444, -3.3265634530820014, -3.3265634530820085, -6.528356607304887e-10, 1.6521880752407014e-12, 6.566095180460252e-10] | 1 |
4 | job_a_4_2 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.317017 | [[2.140556230444804e-17, 9.465137265930533e-17, -1.6146749725116617e-17]] | [-7.344005402848352, -7.344005402593722, -7.344005404806233, -4.6368149924092e-10, -7.669372280361131e-10, 7.669350452488288e-10] | 1 |
5 | job_a_4_3 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.241535 | [[-5.0018187940959333e-17, -7.753256254350387e-17, -7.947668332487412e-17]] | [-10.206225126673713, -10.206225127480902, -10.2062251274809, -6.120026228018106e-11, 5.826092092320323e-10, 7.850612746551634e-11] | 1 |
6 | job_a_4_4 | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -3.145751 | [[-7.31320096256601e-17, 2.773206044106321e-16, -1.2031135854225408e-16]] | [-11.04382992252993, -11.043829922467113, -11.043829922415048, -1.632215571589768e-11, 6.058689905330539e-12, -6.060145096853376e-12] | 1 |
7 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -363.917370 | [[5.8841820305133305e-15, 3.7990444123892135e-16, 2.740863092043364e-16], [5.02375918642883e-15, -1.5751289161869397e-15, -1.3274971399912499e-15], [1.02695629777827e-15, 8.812395257962181e-16, -8... | [0.999556665124294, 0.9904736758167861, 0.824951894171107, -0.0179550181978282, 0.0636336051961363, -0.042563616603965106] | 108 |
8 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -352.327898 | [[-0.75036385499113, 0.4598380918639449, 0.725200216845603], [-0.271732166045788, 0.302073802280348, 0.257384300490495], [0.448407157614891, -0.296549448310268, 0.241166662468148], [-0.00933123696... | [-0.20333593198685002, -0.0514950077978542, -0.380336649881006, -0.192008378726288, -0.000849147802610262, -0.021247590027086802] | 108 |
9 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -349.888834 | [[-0.380822550162949, 0.795020858927261, 0.552227922138795], [-0.815959874301847, -0.02386117799239805, -0.47623155209815404], [-0.286795110662345, 0.30418979949872, 0.970569998348215], [-0.550088... | [-0.636237386917572, -1.22215332293502, -0.718802458515107, 0.0334113551032086, -0.467781756142989, -0.15083572558775] | 108 |
10 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -350.830057 | [[-0.420012726158848, -0.266177748010431, -0.10061532349639205], [0.127384145208824, -0.248152628480852, -0.154576243850877], [-0.968537642185758, -0.39199433409687007, 0.11778481825176891], [-0.5... | [-0.5131786603043991, 0.0857860295485518, -0.487658631946179, -0.00960577613028209, -0.194135185700395, -0.21118720406901198] | 108 |
11 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -350.702007 | [[0.9153781717748, -0.5196757756775509, 0.498517073710415], [-0.667528324894004, -0.21672146416275304, -0.31211194811421505], [0.120312800491091, -0.043411302241849095, -0.406043538747074], [0.235... | [-0.51351056300053, -0.20205565562462502, -0.349611569619353, 0.0656675082015293, -0.549794514544474, -0.0898565835636916] | 108 |
12 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -350.283347 | [[0.235132393610786, -0.92320491047853, 0.23010090806109595], [0.4464059221284, 0.599380126332439, -0.47014243704993197], [-0.0750267922586507, 0.262245923203885, 0.479967633267096], [-0.694735048... | [-0.578429638769351, -0.434046865104863, -0.168847000423391, 0.144047568705863, -0.10672866413529901, -0.318309629189494] | 108 |
13 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -351.595189 | [[-0.10082749580572, 0.523484145402097, -1.0350343625323], [0.354729759304219, 0.701653159364364, 0.7878115361205081], [-0.0367556462367837, -0.495902791299659, 1.00115978036795], [-0.010109163992... | [-0.48245036720220597, -0.6429619358987481, -0.45615891646821305, 0.12156601043839001, 0.028722694543442204, -0.138869603615178] | 108 |
14 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -351.355429 | [[-0.53033805263242, 0.30151077109769797, 0.48889617311781], [0.365332073308618, 0.257204756311451, 0.46421415718266706], [-0.55795462205606, -0.528920209655739, -1.01193560142206], [-0.8349912505... | [0.13501098545634, 0.8436525885969681, 0.633953784861439, -0.09832009167866321, -0.0583454827760237, -0.148099371736581] | 108 |
15 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -349.861266 | [[-0.735834944299172, 0.16983472103263894, 0.30896574018502293], [-0.56617377829579, -0.06252047071209063, -0.18361928260349702], [-0.326718424112994, -0.721001479459434, 0.09705913082936173], [0.... | [0.446586866327166, 0.13068019034171802, -0.0693758990272317, -0.263866851749946, 0.10925215678469699, 0.13607792544806202] | 108 |
16 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -352.007585 | [[0.301178602933076, 0.258002276944641, -0.27143156569769294], [-0.298930228924858, -0.03075712133705462, -0.364812391605405], [-0.797427142712701, -0.74286906074633, -0.4273785140338071], [0.0727... | [-0.757934402614806, -0.614667264232027, -0.667927922027262, -0.20235825539995303, -0.188727963861254, 0.38752048478086604] | 108 |
17 | lammps_job | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -350.590328 | [[0.224009094047, -0.839607604551706, 0.777637424807001], [0.694708728593008, -0.01978842244270086, -0.563089538819962], [-0.0726046276015651, 0.180208519203361, 0.157259401253289], [0.11284631102... | [-0.57750724810661, -0.330705291904955, -0.24897277922262204, 0.0035474383929712198, 0.217652379540492, -0.372803571664775] | 108 |
18 | lammps_job_vac | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -290.793066 | [[-3.1974423109204496e-14, -0.5904705706758219, -0.590470570675822], [-0.5904705706758229, -3.325957990662624e-14, -0.590470570675822], [-0.5904705706758219, -0.590470570675821, -3.642777903722078... | [58.87969078759508, 58.65874845858828, 58.50796880315438, -0.047554178778302515, -0.33314286846391333, 0.07410497611298358] | 107 |
19 | lammps_job_surf | [element: [None, AtomicNumber 13\nAtomicRadius 118.0\nAtomicMass 26.981539\nColor Silver\nCovalentRadius ... | -428.609075 | [[2.44249065417534e-15, 4.56905929757667e-10, 0.314474097336679], [-9.29811783123569e-16, 4.56905354696835e-10, 0.314474097336678], [2.7611647777231502e-15, 4.56905929757667e-10, 0.314474097336678... | [-0.693731819636784, -0.575956660364361, -0.802656043307567, -0.0667956150106202, -0.145663378134875, -0.0209963500927039] | 128 |
We can now inspect the data in this dataset quite easily
struct = dataset.get_structure(10)
struct.plot3d()
dataset.plot.energy_volume();
dataset.plot.forces()
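Because the entries differ in size, from 1 to 128 atoms, total energies are not directly comparable across rows. One simple extra inspection is the energy per atom. The sketch below computes it for a few rows copied from the table above, using plain Python so that it runs without the dataset; with the real table the same division would use the energy and number_of_atoms columns.

```python
# (name, total energy, number_of_atoms), copied from rows 2, 7, 18 and 19
rows = [
    ("job_a_4_0", -3.367063, 1),
    ("lammps_job", -363.917370, 108),
    ("lammps_job_vac", -290.793066, 107),
    ("lammps_job_surf", -428.609075, 128),
]

# Normalize each total energy by the number of atoms in the structure
per_atom = {name: energy / n_atoms for name, energy, n_atoms in rows}

for name, value in per_atom.items():
    print(f"{name:16s} {value:9.5f}")
```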
The datasets used in the potential fitting procedure for day 2 (obtained from accurate DFT calculations) will be accessed in the same way.
Extra Credit
Add more interesting structures. Ideas:
Dimers, trimers
Cleaving of a bulk structure, i.e. create a super cell and separate the atoms along a chosen plane
High- or low-pressure MD
Different crystal structures
…
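As one possible starting point for the cleaving idea, the sketch below works on plain coordinate lists rather than pyiron structure objects. The function cleave and its parameters are hypothetical names chosen for this illustration: every atom above a chosen plane is shifted along z by a gap, and the cell height grows by the same amount.

```python
def cleave(positions, cell_height, plane_z, gap):
    """Shift all atoms above plane_z by gap along z and enlarge the cell height."""
    shifted = [
        (x, y, z + gap) if z > plane_z else (x, y, z)
        for x, y, z in positions
    ]
    return shifted, cell_height + gap


# Toy column of four atomic layers, 2.0 apart, in a cell of height 8.0
layers = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0), (0.0, 0.0, 4.0), (0.0, 0.0, 6.0)]

# Open a gap of 5.0 between the second and third layer
cleaved, new_height = cleave(layers, cell_height=8.0, plane_z=3.0, gap=5.0)

print([z for _, _, z in cleaved])  # [0.0, 2.0, 9.0, 11.0]
print(new_height)                  # 13.0
```

Running this for a series of gap values would give a set of structures ranging from the intact bulk to two well-separated surfaces.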