Embedding chemical information in image files
Rich Apodaca has been discussing
embedding molecular information in images of molecules, such as a PNG file
depicting a 2D structure. As we move to a more web-centric view of the world, it
is apparent that much research information will be available only via the
web. Whilst images of chemical structures are usually adequate for a human
viewer, the chemical structure cannot be indexed and subsequently searched. In
the previous tutorial I showed how to use AppleScript to extract the information from
the PNG file and then display the structure in a couple of chemical display
packages in an editable form.
A couple of people have asked for a tool to embed metadata into images, and
this script shows how to add a SMILES string to a PNG file using ChemBioDraw.
This script again relies on the excellent ExifTool by Phil Harvey.
ExifTool is a platform-independent Perl library plus a command-line application
for reading, writing and editing meta information in images. To allow the
addition of custom tags you need to edit and install the exiftool configuration
file.
The file and full instructions can be downloaded from the ExifTool website, but
you need to download the source, not the Mac OS X binary. To save you time you can
just download the file.
The part of the file that adds the tags is shown below. This new
configuration file will allow you to add metadata for SMILES, molfile and sdf to
PNG files.
# The %Image::ExifTool::UserDefined hash defines new tags to be added
# to existing tables.
%Image::ExifTool::UserDefined = (
    # new PNG tags are added to the PNG::TextualData table:
    'Image::ExifTool::PNG::TextualData' => {
        SMILES => { },
        molfile => { },
        sdf => { },
    },
);
The downloaded file needs to be installed in the same folder as exiftool,
which, if you installed it using the binary installer, will be /usr/bin. You will
need to rename the file to .ExifTool_config (note the leading period!). This is
easiest to do using the Terminal. You will need the admin password.
sudo mv /Users/Chris/Desktop/ExifTool_config /usr/bin/.ExifTool_config
If you have installed ExifTool anywhere else you will need to install the
configuration file in the appropriate place. You also need OpenBabel, and the
easiest way to install it is to install ChemSpotlight.
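If you would rather not copy the configuration file into a system folder, ExifTool can also be pointed at it explicitly with its -config option, which has to come before all other arguments on the command line. The following is only a minimal sketch: the CHEM_CONFIG path and the run_exiftool function name are assumptions made for illustration, so adjust them to wherever you saved the downloaded file.

```shell
# Hypothetical location of the downloaded configuration file; change as needed.
CHEM_CONFIG="$HOME/ExifTool_config"

# run_exiftool ARGS...
# Illustrative wrapper: fails early if the config file is missing, otherwise
# runs exiftool with -config as the first argument, as ExifTool requires.
run_exiftool() {
    if [ ! -f "$CHEM_CONFIG" ]; then
        echo "config file not found: $CHEM_CONFIG" >&2
        return 1
    fi
    exiftool -config "$CHEM_CONFIG" "$@"
}

# Example (not run here): run_exiftool -SMILES="CCO" structure.png
```

With this approach the file does not need to be renamed or moved with sudo, at the cost of having to pass the option on every call.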
First draw the structure in ChemDraw and save it as a PNG. Ignore the warning
that chemical information will be lost; we are going to embed it in the metadata
:-). Leave ChemBioDraw open with the structure in place.
We then tell ChemBioDraw to get the SMILES string, with a little checking to
make sure a structure is selected. The next part uses OpenBabel to generate a
canonical SMILES. The last part creates the shell script telling ExifTool to
embed the metadata into the selected image file.
The first part of the script below simply asks the user to select the image
file to which the chemical metadata should be added. It then generates the POSIX
path to the file, since ExifTool requires UNIX-style paths. ExifTool will add the
metadata and also keep a copy of the original PNG file at the same location,
with _original appended to the file name.
You can check that the metadata has been embedded using the ExifTool command:
exiftool -v path/to/imagefile
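The -v listing is quite verbose. If you only want the embedded SMILES back, for example to pass it to another program, you can ask ExifTool for that single tag. This is a small sketch that assumes the configuration file described earlier is installed; the read_smiles function name is made up for illustration.

```shell
# read_smiles PNG
# Illustrative helper: prints only the value of the SMILES tag.
# -s3 tells exiftool to print the tag value without the tag name.
read_smiles() {
    png=$1
    if [ ! -f "$png" ]; then
        echo "no such file: $png" >&2
        return 1
    fi
    exiftool -s3 -SMILES "$png"
}

# Example (not run here): read_smiles structure.png
```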
If you install the AppleScript in the folder
Applications:CS ChemOffice 2008:CS ChemDraw:ChemDraw Items:
then the next time you start up ChemDraw there will be a "Scripts" menu in the top
menu bar and you will be able to run the script from within ChemDraw.
The applescript to embed chemical metadata
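-- ask for the PNG file and convert it to the POSIX path that ExifTool requires
set theFile to (choose file with prompt "Select the PNG file to add metadata:") as alias
set the_posix_file to POSIX path of theFile
-- get the SMILES from ChemBioDraw; if nothing is selected, select everything first
tell application "CS ChemBioDraw Ultra"
	if not (enabled of menu item "copy") then
		do menu item "Select All" of menu "Edit"
	end if
	set the_SMILES to SMILES of selection
end tell
--use openbabel to get canonical SMILES
set the_ob_script to "echo " & quoted form of the_SMILES & " | /usr/local/bin/babel -ismi -osmi"
try
	set ob_smiles to (do shell script the_ob_script)
on error
	-- if babel fails, fall back to the SMILES supplied by ChemBioDraw
	set ob_smiles to the_SMILES
end try
--display dialog ob_smiles
-- quoted form protects spaces and shell metacharacters in the SMILES and the path
set theScript to "exiftool " & quoted form of ("-SMILES=" & ob_smiles) & " " & quoted form of the_posix_file
do shell script theScript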
Obviously we would want to add other file formats to the metadata, but
unfortunately there is a bug in ChemBioDraw such that you cannot script saving
as other file formats.
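One possible workaround, offered only as a sketch, is to let OpenBabel generate the molfile from the SMILES and have ExifTool read the tag value from that file with its -TAG<=FILE syntax. This assumes the molfile tag from the configuration file is installed and that babel lives in /usr/local/bin as in the AppleScript above; the embed_molfile function name is hypothetical. Note that a molfile produced this way comes from the SMILES, so it will not carry the coordinates of your ChemBioDraw drawing.

```shell
# embed_molfile SMILES PNG
# Illustrative helper: converts a SMILES string to a molfile with OpenBabel
# and stores the result in the user-defined "molfile" PNG tag.
embed_molfile() {
    smiles=$1
    png=$2
    if [ -z "$smiles" ] || [ ! -f "$png" ]; then
        echo "usage: embed_molfile SMILES existing.png" >&2
        return 1
    fi
    tmp=$(mktemp "${TMPDIR:-/tmp}/molfile.XXXXXX") || return 1
    # Same babel location and stdin usage as the AppleScript above.
    if ! printf '%s\n' "$smiles" | /usr/local/bin/babel -ismi -omol > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # -TAG<=FILE sets a tag from the contents of a file; the quotes stop the
    # shell from treating "<" as a redirection.
    exiftool "-molfile<=$tmp" "$png"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (not run here): embed_molfile "CCO" structure.png
```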