Matgen toolkit
Latest revision as of 13:34, 14 February 2020
Descriptions
Matgen toolkit is a collection of Material Gene Engineering (MGE) software written in C++. The core modules are:
- Remove Solvents
- Find Space Groups
- In-Cell
- ICSD' Classify And Unique
- CSD' Classify
- Format
- Splice Molecule
Remove Solvents
Description
This program removes solvent molecules from a MOF (metal-organic framework) CIF file.
Usage
usage: ./bin/rm_mof_solvents --cif_in=string [options] ...
options:
  -i, --cif_in         input MOF cif file (string)
  -o, --output_path    output filepath (string [=])
  -f, --force          remove solvent molecules anyway
  -?, --help           print this message
Example
$ ./bin/rm_mof_solvents -i ./examples/cod/ABAGAO.MOF_subset.cif -o ./examples/result
Parsing the cif file - ABAGAO.MOF_subset.cif
Getting some known resources...
Building base cell...
The number of bonded atom pairs is 80
Looking for solvent in ABAGAO.MOF_subset.cif
The calculated solvent molecule to be screened is [ H2O<known> ]
The MOF framework is [ C14CuH13N3O4 ]
Exporting result...
Export file ./example/result/ABAGAO_clean.cif successfully!
Find Space Groups
Description
This program reads the space group information of the structure in a CIF file. It is based on spglib.
Usage
usage: ./bin/find_space_groups --input=string [options] ...
options:
  -i, --input         input cif file name (string)
  -v, --version       return the version of spglib
  -w, --why           this method is used to see roughly why spglib failed
  -s, --spacegroup    internatioanl space group short symbol and number are obtained as a string
  -m, --symmetry      symmetry operations are obtained as a dictionary
  -r, --refine        standardized crystal structure is obtained as a tuple of lattice (a 3x3 numpy array), atomic scaled positions (a numpy array of [number_of_atoms,3]), and atomic numbers (a 1D numpy array) that are symmetrized following space group type.
  -p, --primitive     is found, lattice parameters (a 3x3 numpy array), scaled positions (a numpy array of [number_of_atoms,3]), and atomic numbers (a 1D numpy array) is returned.
  -d, --dataset       dataset,cell and symprec;angle_tolerance;hall_number;number;choice;transformation_matrix;origin shift;wyckoffs;site_symmetry_symbols;equivalent_atoms;mapping_to_primitive;rotations and translations;pointgroup;std_lattice;std_positions;std_types;std_rotation_matrix;std_mapping_to_primitive
  -c, --symmfdset     A set of crystallographic symmetry operations corresponding to hall_number is returned by a dictionary where rotation parts and translation parts are accessed by the keys rotations and translations, respectively.
  -f, --spgfdset      This function allows to directly access to the space-group-type database in spglib (spg_database.c). A dictionary is returned. To specify the space group type with a specific choice, hall_number is used.
  -n, --niggli        Niggli reduction is achieved using this method.
  -l, --delaunay      Delaunay reduction is achieved using this method.
  -k, --irrkpoints    Irreducible k-points are obtained from a sampling mesh of k-points
  -?, --help          print this message
Example
$ ./bin/find_space_groups -i ./examples/cod/WAJZUE.cif -s
Parsing the cif file - ABETIN_clean.cif
Getting some known resources...
The space group is: I4_1/acd 142
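The single-file call above can be repeated over a directory of CIF files. The following is a minimal sketch, not part of the toolkit: the function name space_groups_in_dir and the BIN variable are illustrative, and the dry-run branch is only there so the loop can be tried before the binary has been built. It uses only the -i and -s options listed in the usage text.

```shell
# Path to the binary; override BIN if your build puts it elsewhere (assumption).
BIN="${BIN:-./bin/find_space_groups}"

# Print the space group of every .cif file in the given directory.
space_groups_in_dir() {
  dir="$1"
  for cif in "$dir"/*.cif; do
    # When nothing matches, the pattern stays literal; skip it.
    [ -e "$cif" ] || continue
    if [ -x "$BIN" ]; then
      "$BIN" -i "$cif" -s
    else
      # Dry run: show the command that would be executed.
      echo "would run: $BIN -i $cif -s"
    fi
  done
}

space_groups_in_dir ./examples/cod
```

Quoting "$cif" keeps file names that contain spaces intact.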
Problem and Solution
The program may fail with the following error:
error while loading shared libraries: libsymspg.so.1: cannot open shared object file: No such file or directory
This happens because the program looks for /lib64/libsymspg.so by default, and the library is not in /lib64/. Run the following command first so that the program can find the library in the directory given:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./include/spglib/_build
Change the spglib path to match its actual location on your system.
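A slightly more defensive variant of the export line is sketched below. It assumes the same build directory as above; SPGLIB_DIR is an illustrative variable name, not something the toolkit reads. The sketch resolves the directory to an absolute path when it exists, so the setting still works after changing directories, and it avoids adding an empty entry to LD_LIBRARY_PATH when the variable was previously unset.

```shell
# Location of the spglib build directory; adjust to your checkout (assumption).
SPGLIB_DIR="${SPGLIB_DIR:-./include/spglib/_build}"

# Resolve to an absolute path when the directory exists, so the setting
# survives a later `cd`; otherwise keep the value as given.
if [ -d "$SPGLIB_DIR" ]; then
  SPGLIB_DIR="$(cd "$SPGLIB_DIR" && pwd)"
fi

# Append the directory. The ${VAR:+...} form adds the ':' separator only
# when LD_LIBRARY_PATH already has a value, so no empty entry is created.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$SPGLIB_DIR"
```

Running ldd ./bin/find_space_groups afterwards should show whether libsymspg.so.1 now resolves to a file.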
In-Cell
Description
This program generates the in-cell structure from the space group information of the unit cell.
Usage
usage: ./bin/in_cell --input_path=string --output_path=string [options] ...
options:
  -i, --input_path     input MOF cif file (string)
  -o, --output_path    output file path (string)
  -?, --help           print this message
Example
$ ./bin/in_cell -i ./examples/cod/WAJZUE.cif -o ./examples/result
Getting some known resources...
Parsing the cif file - WAJZUE.cif
Exporting in-cell result...
Export file ./examples/result/WAJZUE_in_cell.cif successfully!
ICSD' Classify And Unique
Description
This program classifies ICSD CIF files and removes duplicate files.
Classification rules: component/element type/space group/
Usage
usage: ./bin/ICSD_classify --input_dir=string --output_dir=string [options] ...
options:
  -i, --input_dir     icsd folder location (string)
  -o, --output_dir    classification result export location (string)
  -l, --log           print the detail log, no log by default
  -?, --help          print this message
Example
$ ./bin/ICSD_classify -i ./examples/icsd -o ./examples/icsd_classify
CSD' Classify
Description
This program removes files that contain metal elements, disordered molecules, or known solvents from the CSD database. You can restrict the exclusion to specific metal elements. The result may contain two folders: csd_warning holds structures whose atoms are bonded to two or more parts, and csd_normal holds structures whose atoms are bonded to only one part.
Usage
usage: ./bin/csd_classify --input_dir=string --output_dir=string [options] ...
options:
  -i, --input_dir     csd folder location (string)
  -o, --output_dir    classification result export location (string)
  -r, --remove        only remove the cif which contains special elements or special bonds(the input form likes special meatal/special bonds(Fe|Cu/Fe-O|C-O&C-H) or only input one of them, please use '/' as separators for elements and bonds) (string [=])
  -k, --keep          only keep the cif which contains special elements and special bond(the input form likes special meatal/special bonds(Fe|Cu/Fe-O|C-O&C-H) or only input one of them, please use '/' as separators for elements and bonds (string [=])
  -l, --log           print the detail log, no log by default
  -u, --unique        remove duplicate files
  -?, --help          print this message
Example
./bin/CSD_classify -i ./examples/csd -o ./examples/csd_classify
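The -r and -k options take a filter expression that contains | and &, which the shell treats as pipe and background operators unless the expression is quoted. The sketch below uses the sample expression from the help text and splits it at the '/' separator to show the element part and the bond part. The variable names are illustrative, and the help text does not spell out what | and & mean, so reading them as "or" and "and" would be an assumption. The usage text and the example above spell the binary name with different case (csd_classify and CSD_classify); set BIN to whichever your build produces.

```shell
# Sample expression from the help text: elements before '/', bonds after.
# Single quotes stop the shell from interpreting '|' and '&'.
FILTER='Fe|Cu/Fe-O|C-O&C-H'

# Split at the first '/' to show the two parts the help text describes.
elements="${FILTER%%/*}"
bonds="${FILTER#*/}"
echo "elements: $elements"
echo "bonds:    $bonds"

# Binary name as given in the usage text; adjust if your build differs.
BIN="${BIN:-./bin/csd_classify}"

# Quoted, the whole expression reaches the program as one argument.
if [ -x "$BIN" ]; then
  "$BIN" -i ./examples/csd -o ./examples/csd_classify -r "$FILTER"
fi
```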
Format
Description
This program converts a CIF file to another format (Gaussian gjf or VASP). The converted file can use fractional or Cartesian atomic coordinates, and the structure can be written in in-cell or asymmetric mode.
Usage
usage: ./bin/format --input=string --output=string --type=string [options] ...
options:
  -i, --input         input file (string)
  -o, --output        output path of the conversion result (string)
  -m, --mode          the mode of the format conversion(in-cell/asymmetric) (string [=asymmetric])
  -t, --type          the type of the result format(gjf/vasp), convert format to vasp file format or gaussion format (string)
  -c, --coord_type    the type of the coordinate(fract/cart), the coordinates of the atom in the conversion result are fractional coordinates or cartesian coordinates (string [=fract])
  -?, --help          print this message
Example
Convert to a Gaussian input file:
$ ./bin/format -i ./examples/cod/WAJZUE.cif -o ./examples/result -t gjf
Convert to a VASP file:
$ ./bin/format -i ./examples/cod/WAJZUE.cif -o ./examples/result -t vasp
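The optional -m and -c flags from the usage text can be combined with either output type. The sketch below is an illustrative wrapper, not part of the toolkit: the function name convert_both and the BIN variable are made up for this example, and the dry-run branch only prints the command when the binary is not available. Whether every mode and coordinate combination is accepted for both output types is not stated in the help text.

```shell
# Path to the binary; override BIN if your build puts it elsewhere (assumption).
BIN="${BIN:-./bin/format}"

# Convert one CIF file to both supported types (gjf and vasp),
# using in-cell mode and Cartesian coordinates instead of the defaults.
convert_both() {
  cif="$1"
  out="$2"
  for type in gjf vasp; do
    if [ -x "$BIN" ]; then
      "$BIN" -i "$cif" -o "$out" -t "$type" -m in-cell -c cart
    else
      # Dry run: show the command that would be executed.
      echo "would run: $BIN -i $cif -o $out -t $type -m in-cell -c cart"
    fi
  done
}

convert_both ./examples/cod/WAJZUE.cif ./examples/result
```

Leaving out -m and -c falls back to the defaults shown in the usage text (asymmetric mode, fractional coordinates).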
Splice Molecule
Description
This program splices two molecules, A and B, at the specified connecting atoms.
Usage
usage: ./bin/splice_molecule --molecule_a=string --molecule_b=string --output=string --type=string --connect_a=int --connect_b=int [options] ...
options:
  -a, --molecule_a    path of the molecule A (string)
  -b, --molecule_b    path of the molecule B (string)
  -o, --output        the output path (string)
  -t, --type          the type of the result format(gjf/xyz), convert format to gaussion format or xyz format (string)
  -i, --connect_a     the serial number of connect site in molecule A (int)
  -j, --connect_b     the serial number of connect site in molecule B (int)
  -?, --help          print this message
Example
$ ./bin/splice_molecule -a ./examples/mol/molecule-A-label.mol -b ./examples/mol/molecule-B-label.mol -i 31 -j 7 -t gjf -o ./examples/mol