Welcome!
Mass Spectrometry meets Cheminformatics
Tobias Kind and Julie Leary
UC Davis
Course 1: General Introduction
Class website: CHE 241 - Spring 2008 - CRN 16583
Slides: http://fiehnlab.ucdavis.edu/staff/kind/Teaching/
PPT is hyperlinked – please change to Slide Show Mode
What is ChemInformatics?
Chemistry
Statistics
Informatics
Mathematics
Chemometrics est. 1975
Cheminformatics est. 1998
Who uses Cheminformatics?
All parts of chemistry heavily depend on cheminformatics.
Life sciences, biochemistry, drug industries use cheminformatics.
20 years ago: 80% in lab – 20% in front of computer
Now: 20% in lab  - 70% in front of computer (*)
Examples:
• Organic chemistry – automated reaction planning, Beilstein search
• Physical chemistry – modeling of structure properties (boiling points)
• Inorganic chemistry – ligand bond interactions
• Analytical chemistry – structure elucidation of small compounds
• Biochemistry – protein/small molecule interaction networks
(*) 10% fixing and installing new programs
Motivation for Mass Spectrometry meets ChemInformatics
To be a master of spectra you need to be a master of structures in the first place.
[Figure: MS/MS spectrum of vincristine (nist_msms) with its chemical structure; x-axis m/z 260-810, y-axis relative intensity 0-100; labeled peaks at m/z 265, 353, 395, 455, 513, 538, 604, 636, 676, 705, 723, 747, 765, 807]
→ Complex MS data interpretations are only possible with software
→ MS data are obtained by hyphenated techniques (GC-MS, LC-MS)
→ Mass spectral database search and structure search are routinely used
→ Mass spectrometers deliver multidimensional data
Computer Illiteracy – a threat to your research
Your computer is your friend
You don’t have a computer? You don’t have a friend (just kidding)
• Assume you have a computer:
Please step forward and name: CPU, speed, memory, hard disk, OS
• You are a chemist, biochemist, biologist:
Please step forward and name: a computer language or DB you know
OS = operating system; DB = database; CPU = central processing unit
PDP-11 www.bell-labs.com
Fighting Computer Illiteracy - name your PC
Component   Vendors               Examples                          Typical range
CPU         INTEL, AMD, IBM, HP   Pentium, Opteron, Core Duo        2-3 GHz
Memory      GEIL, KINGSTON        DDR, DDR2                         1-8 GByte
Hard disk   SEAGATE, WD           Raptor, Barracuda, Cheetah        100-1000 GByte
OS          MICROSOFT, LINUX      Windows, Linux, OSX, Virtual OS
Language                          C, Basic, Perl, JAVA
Bit < Byte < kByte < MByte < GByte
Single Core < Dual Core < QuadCore < MultiCore
MFLOP/s < GFLOP/s < TFLOP/s < PFLOP/s
1 Thread < Dual Thread < MultiThreaded
Picture: Cray-2 in red, Nixdorf Museum, 2004
Computer Illiteracy – learn a programming language
Why should you?
20% lab time – 80% computer time
Mass spectrometers deliver data – not results
Why shouldn't you? (fake reasons)
You are too old to learn…
You are not good with computers…
You have more important research to do…
You are so rich you have programmers who work for you…
Picture Source: WIKI James Manners from Genova, Italia 
Computer Illiteracy – learn a programming language
• Learn any language which has a large code and user base (JAVA, Perl, Visual Basic)
• Use IDEs with automatic code completion like MS Visual Express or Eclipse
• Don’t re-invent code - use (and document) code search engines like
koders.com; 
google.com/codesearch
krugle.com
Language “cow”:
moOMoOMoOMoOMoOmoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMo
OMMMmoOMMMMoOMoOMoOMoOMoOMoO
MoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMo
OMMMmoOMMMommMoOMoOMoOMoOMoO
MoOMoOMoOMoOMoOMoOMoOMMMmoOMMMMoOMoOMMMmoOMMM
MoOMoOMoOMoOMoOMoOMoOMoOMoOMoO
MoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMoOMo
OMoOMoOMoOMoOMoOMoOMoOMoOMoO
Language “brainfuck”:
>>++++++++[<++++>-]
>++++++++++++++[<+++++++>-]
+>+++++++++++[<++++++++++>-]
++>+++++++++++++++++++[<++++++>-]
++>+++++++++++++++++++[<++++++>-]
>++++++++++++[<+++++++++>-]
Do *not* learn these working but esoteric languages
There are 1123 programming languages: http://99-bottles-of-beer.net/
Program development – Eclipse for JAVA example
[Screenshot: Eclipse IDE with three labeled areas: Projects, JAVA or C code, Text output]
Computer Illiteracy – your emergency helpers
Regular expressions, SQL database requests, EXCEL VBA scripts and Perl scripts are special tools for data handling (Swiss army knives)
Regular expressions (RegEx) are used for finding and replacing text
[0-9] – represents any digit
[a-z] – represents any lowercase letter
\n – represents new line (CR/LF)
\t – represents TAB
Examples:
\n\n – find two consecutive line breaks (an empty line)
find \t and replace with a space “ ”
([0-9][0-9]) – find two digits in a row
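The RegEx building blocks above can be tried in any scripting language. The following is only a minimal sketch in Python (standard `re` module); the sample text and the variable names are made up for illustration.

```python
import re

# Made-up sample text: two records separated by an empty line, fields separated by TAB
text = "scan 42\tpeaks 70\n\nscan 43\tpeaks 5\n"

# [0-9][0-9] finds two digits in a row (a single digit such as "5" is not matched)
two_digits = re.findall(r"[0-9][0-9]", text)

# [a-z] matches one lowercase letter; the + extends the match to a whole run
lower_runs = re.findall(r"[a-z]+", text)

# \n\n matches two consecutive line breaks; replace them with a single one
collapsed = re.sub(r"\n\n", "\n", text)

# \t matches a TAB; replace every TAB with a space
no_tabs = re.sub(r"\t", " ", text)

print(two_digits)
print(lower_runs)
```

Text editors such as Textpad use the same patterns in their Find/Replace dialog, so a pattern tested in a script can usually be reused there.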
SQL is used for programming databases
Large database table:
yr subject winner
1901 Chemistry Jacobus H. van 't Hoff
1902 Chemistry Emil Fischer
1903 Chemistry Svante Arrhenius
1904 Chemistry Sir William Ramsay
1905 Chemistry Adolf von Baeyer
1906 Chemistry Henri Moissan
1907 Chemistry Eduard Buchner
1908 Chemistry Ernest Rutherford
1909 Chemistry Wilhelm Ostwald
1910 Chemistry Otto Wallach
1913 …
SQL query:
SELECT yr, subject, winner
FROM nobel
WHERE yr = 1909 AND
subject = 'Chemistry'
Result:
yr subject winner
1909 Chemistry Wilhelm Ostwald
Visit the SQL Zoo
Learn about RegEx
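The table, query and result from this slide can be reproduced without installing a database server. This is a minimal sketch that assumes Python's built-in `sqlite3` module and an in-memory database; only the rows shown on the slide are loaded. In SQLite the string comparison is case-sensitive, so the subject has to be written exactly as it is stored.

```python
import sqlite3

# In-memory database: nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nobel (yr INTEGER, subject TEXT, winner TEXT)")

# Rows copied from the slide
rows = [
    (1901, "Chemistry", "Jacobus H. van 't Hoff"),
    (1902, "Chemistry", "Emil Fischer"),
    (1903, "Chemistry", "Svante Arrhenius"),
    (1904, "Chemistry", "Sir William Ramsay"),
    (1905, "Chemistry", "Adolf von Baeyer"),
    (1906, "Chemistry", "Henri Moissan"),
    (1907, "Chemistry", "Eduard Buchner"),
    (1908, "Chemistry", "Ernest Rutherford"),
    (1909, "Chemistry", "Wilhelm Ostwald"),
    (1910, "Chemistry", "Otto Wallach"),
]
conn.executemany("INSERT INTO nobel VALUES (?, ?, ?)", rows)

# Same query as on the slide, with the values passed as parameters
result = conn.execute(
    "SELECT yr, subject, winner FROM nobel WHERE yr = ? AND subject = ?",
    (1909, "Chemistry"),
).fetchall()
print(result)
conn.close()
```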
Regular Expressions – example MS data
Task: create a list of 4 columns with names, formulas, CAS numbers and peaks
Problem: 24,000 lines of mass spectral data (*.msp)
Program: Textpad (WIN), Smultron (Mac)
[Figure: text editor screenshot of the *.msp file showing the number of lines in the text, the (m/z - intensity pair) entries and Enter (CR/LF) in gray; inset mass spectrum of 2,5-Pyrrolidinedione, 1-methyl-3-phenyl- (mainlib), x-axis m/z 10-190, y-axis relative intensity 0-100]
Regular Expressions – example MS data 
Solution: replace Enter (\n) with TAB (\t) and use Replace ALL
Regular Expressions – example MS data 
Solution: copy only lines of interest (Mark ALL – Copy Bookmarked Lines)
Regular Expressions – Result for MS data 
Solution: Replace redundant code with nothing, copy tab separated file to EXCEL
Result: 1:30 min for RegEx job
(1 hour manually?) 
Average spectrum size: 70 peaks
Minimum size: 5 peaks
Maximum size: 439 peaks
Most spectra have between 35 and 45 peaks
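The same four-column list (name, formula, CAS number, peaks) can also be produced with a short script instead of an editor. This is only a sketch in Python: the two records are made up, and the field labels (`Name:`, `Formula:`, `CAS#:`) and the one-peak-per-line layout are assumptions about the *.msp file that should be checked against the real data.

```python
import re

# Made-up records in an assumed MSP-like layout (not real NIST data)
msp_text = """Name: Example compound A
Formula: C11H8O3
CAS#: 0-00-0
Num Peaks: 3
51 120
104 999
189 450

Name: Example compound B
Formula: C6H12O6
CAS#: 0-00-1
Num Peaks: 2
60 300
73 999
"""

def field(record, label):
    """Return the text after 'label:' on its line, or '' if the field is absent."""
    match = re.search(r"^" + re.escape(label) + r":[ \t]*(.*)$", record, re.MULTILINE)
    return match.group(1).strip() if match else ""

table = []
# Records are separated by an empty line (two or more line breaks)
for record in re.split(r"\n\n+", msp_text.strip()):
    # A peak line holds exactly two numbers: m/z and intensity
    peaks = re.findall(r"^([0-9]+)[ \t]+([0-9]+)$", record, re.MULTILINE)
    peak_text = " ".join(f"{mz}:{intensity}" for mz, intensity in peaks)
    # One TAB-separated line per spectrum, ready to paste into EXCEL
    table.append("\t".join([field(record, "Name"), field(record, "Formula"),
                            field(record, "CAS#"), peak_text]))

print("\n".join(table))
```

Counting the entries in `peaks` for every record would also give the peak statistics (minimum, maximum, average) listed on this slide.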
Be prepared – visualize your structures
Try Marvin Space via Webstart
Be prepared - StereoIsomers
How many stereoisomers can you expect from glucose (KEGG)?
Example: separation of species with ion mobility MS (FAIMS)
Example calculated with MarvinView (via JAVA Webstart)
[Figure: structure of glucose]
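A quick upper bound for the stereoisomer question is 2^n for n stereocenters. The sketch below uses that rule only; it is not the MarvinView calculation, and the function name is made up. Open-chain glucose has 4 stereocenters, the six-membered ring form has 5 because the anomeric carbon is added. Symmetry (meso forms) can lower the real count for other molecules.

```python
def max_stereoisomers(stereocenters: int) -> int:
    """Upper bound 2^n for a molecule with n stereocenters (ignores symmetry)."""
    if stereocenters < 0:
        raise ValueError("number of stereocenters cannot be negative")
    return 2 ** stereocenters

open_chain = max_stereoisomers(4)  # open-chain aldohexose: C2 to C5
ring_form = max_stereoisomers(5)   # ring form: C1 to C5

print(open_chain, ring_form)
```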
Be prepared – Resonance (electron shifts)
What are possible resonance structures?
Important for mass spectral interpretation (electron impact, electrospray)
[Figure: structure of phenol]
Example calculated with MarvinView (start via WebStart)
Be prepared – Tautomers
How many tautomers can you expect?
Important for mass spectral interpretations.
[Figure: structure of methyl acetate]
Example calculated with MarvinView (start via WebStart)
Mass spectral database search – know what exists
How many mass spectra with formula C11H8O3 in NIST DB?
Result: 19 for C11H8O3 in NIST05 DB
Download NIST-MS-Search
Mass spectral interpretation
Assign structural elements to mass spectral peaks
Download Mass Spectrum Interpreter Version 2
Structure search – know what could be possible
How many compounds (isomer structures) are found in public databases?
Result: 272 for C11H8O3
http://www.chemspider.com/
Molecular Weight Calculator
[Figure: calculated isotopic pattern; x-axis m/z 522.00-532.00, y-axis relative intensity 0.0-100.0]
Calculate isotopic masses
Find formulas from masses
Calculate isotopic patterns
Download MWTWIN
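The first task, calculating a monoisotopic mass from a formula, fits in a few lines. This is a sketch in Python and not how MWTWIN works internally; the element table holds rounded monoisotopic masses for four elements only, and the function name is made up. C11H8O3 is the formula used in the database search examples.

```python
import re

# Rounded monoisotopic masses in u (most abundant isotope of each element)
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

def monoisotopic_mass(formula: str) -> float:
    """Sum the monoisotopic masses for a simple formula such as 'C11H8O3'.

    Brackets and charges are not handled; unknown elements raise KeyError.
    """
    total = 0.0
    # Each match is an element symbol followed by an optional count
    for element, count in re.findall(r"([A-Z][a-z]?)([0-9]*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

mass = monoisotopic_mass("C11H8O3")
print(f"{mass:.4f}")
```

Finding formulas from a measured mass is the reverse problem and needs a search over element counts, which is why dedicated tools are used for it.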
Stay tuned – new mass spectrometry publications
via Yahoo Pipes
The Last Page - What is important to remember:
Learn about CPU type, memory, hard disks, bits and bytes
→ shock your colleagues with random questions about their computer
Think about automation, things you would like to do (even if you can’t)
→ shock your colleagues with a small computer script
Use regular expressions for stupid or boring jobs
→ if you delete/replace data more than 3x, remember RegEx, RegEx, RegEx
Use scripting languages for small problems (EXCEL VBA, PERL)
→ steal some small examples and color your EXCEL data in rainbow colors
Generate yourself a collection of programs and databases for MS
→ try such programs in a Virtual Machine without messing up your system
Tasks:
The PowerPoint slides are all hyperlinked.
1) Download and install the mentioned tools (JAVA required)
2) Visit the databases and online websites
3) Repeat shown examples
4) Check notes in PPT for additional information
Literature:
Check notes and links in PPT
Links:
Used for research: (right click – open hyperlink)
• http://www.google.com/search?hl=en&q=Computer+Illiteracy++site%3A.nsf.gov&btnG=Search
• http://www.computerhistory.org/microprocessors/
• http://www.google.com/search?hl=en&q=holy+crap+site%3A.edu&btnG=Search
• http://allendowney.com/essays/complaints.html
• http://www.google.com/search?hl=en&q=editor+for+mac+regular+expressions&btnG=Search
• SQL learning http://sqlzoo.net/
• Virtual Machine for MAC http://www.parallels.com/en/shop/online/
(run WINDOWS and LINUX on an INTEL MAC)
• http://www.microsoft.com/windows/products/winfamily/virtualpc/default.mspx
(Virtual PC or VMWare - run multiple WINDOWS or LINUX under WIN or vice versa)
Of general importance for this course:
http://fiehnlab.ucdavis.edu/staff/kind/Metabolomics/Structure_Elucidation/