This website contains information on obtaining the whole genome shotgun sequence of the Cannabis sativa cultivar "Chemdawg." The data is provided by Medicinal Genomics with the help of Nimbus Informatics. Academic use is free of charge, but Amazon EC2 costs are the responsibility of the user. If you are a commercial enterprise, please contact Medicinalgenomics@gmail.com for a commercial license.
The sequence data is derived from an Illumina HiSeq run using v2.0 chemistry with 2x100 reads. There are 7 lanes in total, which add up to 131Gb of sequence. Quality statistics for the run can be found here. The genome is estimated to be 400Mb, which gives an estimated 327X coverage.
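The coverage figure follows directly from the numbers quoted above. The short sketch below reproduces the arithmetic; it assumes that "Gb" and "Mb" here mean gigabases and megabases, and the variable and function names are purely illustrative.

```python
# Back-of-the-envelope check of the coverage estimate quoted on this page.
# Assumption: "131Gb" and "400Mb" are counts of bases, not bytes or bits.
TOTAL_BASES = 131 * 10**9   # 131Gb of sequence across all lanes
GENOME_SIZE = 400 * 10**6   # estimated genome size of 400Mb
READ_LENGTH = 100           # each read in a 2x100 run is 100 bases long
LANES = 7


def fold_coverage(total_bases, genome_size):
    """Average number of times each base of the genome is sequenced."""
    return total_bases / genome_size


coverage = fold_coverage(TOTAL_BASES, GENOME_SIZE)
approx_reads = TOTAL_BASES // READ_LENGTH

print(f"coverage: {coverage:.1f}X")
print(f"approximate number of reads: {approx_reads:,}")
print(f"approximate yield per lane: {TOTAL_BASES / LANES / 1e9:.1f}Gb")
```

The result is 327.5, which matches the 327X estimate once the fraction is dropped. The read count and per-lane yield are derived only from the totals above and are rough averages, not measured values.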
There are several ways in which we anticipate people will want to use this data:
If improvements are made to the assembly or variant calls, we ask that people post them to Amazon in public EBS volumes and send a note to Medicinalgenomics@gmail.com so that we can link to the improvements from our website.
We have made a preliminary assembly available via S3:
We have removed the direct download links for the fastq files in favor of the public EBS snapshot as the distribution mechanism for the C. sativa genome. You can create your own EBS volume from the snapshot "snap-f8af5298"; please see the public dataset page hosted by Amazon here. For more information on using an EBS volume snapshot, please see the documentation provided by Amazon here.
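As a rough illustration of creating a volume from the snapshot, the sketch below builds the corresponding calls for the current AWS command line interface. This is an assumption about tooling, not the procedure documented by Amazon or by this site: the availability zone, volume ID, instance ID and device name are hypothetical placeholders, and the region that hosts the snapshot is not stated on this page, so check the dataset page before running anything.

```python
import shlex
import subprocess

SNAPSHOT_ID = "snap-f8af5298"  # snapshot ID given on this page


def create_volume_command(availability_zone, snapshot_id=SNAPSHOT_ID):
    """Build the AWS CLI call that creates an EBS volume from a snapshot.

    The volume must be created in the same availability zone as the EC2
    instance that will mount it.
    """
    return [
        "aws", "ec2", "create-volume",
        "--snapshot-id", snapshot_id,
        "--availability-zone", availability_zone,
    ]


def attach_volume_command(volume_id, instance_id, device="/dev/sdf"):
    """Build the AWS CLI call that attaches the new volume to an instance."""
    return [
        "aws", "ec2", "attach-volume",
        "--volume-id", volume_id,
        "--instance-id", instance_id,
        "--device", device,
    ]


def run(command):
    """Execute a command; needs an installed, configured AWS CLI."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical zone; this only prints the command and does not call AWS.
    print(shlex.join(create_volume_command("us-east-1a")))
```

The commands are only built and printed by default, so nothing is billed until `run` is called explicitly. Once the volume is attached, it still has to be mounted from within the instance before the fastq files can be read.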
September 2, 2011
Please note, we have removed the direct download links. Use the public EBS snapshot to access the sequence data instead. You can get more information by going to the dataset page.
August 27, 2011
Amazon is now hosting the C. sativa genome snapshot as a public dataset. You can get more information by going to the dataset page. The snapshot ID is still "snap-f8af5298" and remains unchanged from the version linked to previously.
August 18, 2011
Today we posted the fastq files for download and made an EBS snapshot of the data available on Amazon.