How to access the UCSC genome database using MySQL

  1. Get and install MySQL (scroll down to get the Essentials for the right OS). In the installation, toggle the option to adjust the PATH on.
  2. Create a bat file (e.g. mySQL_genome.bat) with the line:

    mysql -h genome-mysql.cse.ucsc.edu -u genome hg18

  3. Create a .sql file (e.g. input.sql) with the query. Examples from a reply by Jennifer Jackson on the UCSC mailing list, for returning brain-Expressed Sequence Tags:

    select * from chr10_intronEst,gbCdnaInfo,tissue where gbCdnaInfo.acc = chr10_intronEst.qName and tissue.id = gbCdnaInfo.tissue and tissue.name like "%brain%";

    select tName,tStart,tEnd,qName,name from all_mrna,gbCdnaInfo,tissue where tName = "chr3" and tEnd > 37009983 and tStart < 37067341 and gbCdnaInfo.acc = qName and tissue.id = gbCdnaInfo.tissue and tissue.name like "%brain%";

  4. You can use multiple lines, but be careful not to overdo it: you're only allowed a certain amount of program-generated queries and you can lose access for causing extreme loads.
  5. Now use the command line (type "cmd" in the Srtart menu search box), go to the directory with the files, and type:
    mySQL_genome < input.sql > output.txt

    This starts mysql using the connection parameters in the bat file, then feeds it the command(s) in the text file. The output is sent to output.txt