How to access the UCSC genome database using MySQL
- Get and install MySQL (scroll down to get the Essentials for the right OS). In the installation, toggle the option to adjust the PATH on.
- Create a bat file (e.g. mySQL_genome.bat) with the line:
mysql -h genome-mysql.cse.ucsc.edu -u genome hg18
- Create a .sql file (e.g. input.sql) with the query. Examples from a reply by Jennifer Jackson on the UCSC mailing list, for returning brain-Expressed Sequence Tags:
select * from chr10_intronEst,gbCdnaInfo,tissue where gbCdnaInfo.acc = chr10_intronEst.qName and tissue.id = gbCdnaInfo.tissue and tissue.name like "%brain%";
select tName,tStart,tEnd,qName,name from all_mrna,gbCdnaInfo,tissue where tName = "chr3" and tEnd > 37009983 and tStart < 37067341 and gbCdnaInfo.acc = qName and tissue.id = gbCdnaInfo.tissue and tissue.name like "%brain%";
- You can use multiple lines, but be careful not to overdo it: you're only allowed a certain amount of program-generated queries and you can lose access for causing extreme loads.
- Now use the command line (type "cmd" in the Srtart menu search box), go to the directory with the files, and type:
mySQL_genome < input.sql > output.txt
This starts mysql using the connection parameters in the bat file, then feeds it the command(s) in the text file. The output is sent to output.txt