
- Appendix AI Administrative Instructions Under the PCT

Nucleotide Sequences

Symbols to Be Used

8. A nucleotide sequence shall be presented only by a single strand, in the 5'-end to 3'-end direction from left to right. The terms 3' and 5' shall not be represented in the sequence.

9. The bases of a nucleotide sequence shall be represented using the one-letter code for nucleotide sequence characters. Only lower case letters in conformity with the list given in Appendix 2, Table 1, shall be used.
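As an illustration of paragraphs 8 and 9, the following minimal sketch checks that a sequence string contains only permitted lower-case one-letter codes. Appendix 2, Table 1 is not reproduced in this section, so the `ALLOWED_SYMBOLS` set below is an assumption (the standard IUPAC nucleotide codes), and the function name `invalid_symbols` is hypothetical.

```python
from typing import List, Tuple

# Assumption: Appendix 2, Table 1 corresponds to the IUPAC one-letter
# nucleotide codes, in lower case. The table itself is not shown here.
ALLOWED_SYMBOLS = frozenset("acgturymkswbdhvn")


def invalid_symbols(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position, character) for every disallowed character.

    Upper-case letters, digits, and the prime mark (as in 5' or 3') are all
    reported, since none of them may appear in the sequence itself.
    """
    return [
        (position, char)
        for position, char in enumerate(sequence, start=1)
        if char not in ALLOWED_SYMBOLS
    ]
```

An empty result means the string uses only the assumed symbol set; any reported position points at a character that would need to be corrected before listing.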

10. Modified bases shall be represented as the corresponding unmodified bases or as "n" in the sequence itself if the modified base is one of those listed in Appendix 2, Table 2, and the modification shall be further described in the feature section of the sequence listing, using the codes given in Appendix 2, Table 2. These codes may be used in the description or the feature section of the sequence listing but not in the sequence itself (see also paragraph 32). The symbol "n" is the equivalent of only one unknown or modified nucleotide.

Format to Be Used

11. A nucleotide sequence shall be listed with a maximum of 60 bases per line, with a space between each group of 10 bases.

12. The bases of a nucleotide sequence (including introns) shall be listed in groups of 10 bases, except in the coding parts of the sequence. Leftover bases, fewer than 10 in number at the end of non-coding parts of a sequence, should be grouped together and separated from adjacent groups by a space.
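The layout described in paragraphs 11 and 12 can be sketched as follows for a non-coding sequence. This is only an illustration; the function name `format_noncoding` and its parameters are hypothetical, and it does not handle coding parts, which paragraph 13 requires as triplets.

```python
from typing import List


def format_noncoding(sequence: str, group: int = 10, per_line: int = 60) -> List[str]:
    """Split a non-coding sequence into lines of at most `per_line` bases,
    with a single space between each group of `group` bases.

    Leftover bases (fewer than `group`) form a final, shorter group.
    """
    lines = []
    for start in range(0, len(sequence), per_line):
        chunk = sequence[start:start + per_line]
        groups = [chunk[i:i + group] for i in range(0, len(chunk), group)]
        lines.append(" ".join(groups))
    return lines
```

For a 65-base sequence, this yields one full line of six groups of 10 bases, followed by a second line holding the 5 leftover bases as one group.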

13. The bases of the coding parts of a nucleotide sequence shall be listed as triplets (codons).
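A minimal sketch of the triplet layout in paragraph 13 follows. The function name `format_coding` is hypothetical, and the default of 16 codons per line is an assumption; any value up to 20 stays within the 60-base limit of paragraph 11.

```python
from typing import List


def format_coding(sequence: str, codons_per_line: int = 16) -> List[str]:
    """List the bases of a coding part as space-separated triplets (codons).

    `codons_per_line` is an assumed layout choice; it must not exceed 20,
    so that no line carries more than 60 bases.
    """
    if not 1 <= codons_per_line <= 20:
        raise ValueError("codons_per_line must be between 1 and 20")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    return [
        " ".join(codons[i:i + codons_per_line])
        for i in range(0, len(codons), codons_per_line)
    ]
```

The sketch assumes the input starts at the first base of a codon and is in frame; it does not attempt to locate coding regions within a longer sequence.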

14. The enumeration of nucleotides shall start at the first base of the sequence with number 1. It shall be continuous through the whole sequence in the direction 5' to 3'. It shall be marked in the right margin, next to the line containing the one-letter codes for the bases, giving the number of the last base of that line. The enumeration method for nucleotide sequences set forth above remains applicable to nucleotide sequences that are circular in configuration, with the exception that the designation of the first nucleotide of the sequence may be made at the option of the applicant.
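The right-margin enumeration in paragraph 14 can be illustrated with the sketch below, which formats a non-coding sequence and appends the number of the last base on each line. The function name `number_lines` and the fixed column width are assumptions made for the example, not part of the Instructions.

```python
from typing import List


def number_lines(sequence: str, group: int = 10, per_line: int = 60) -> List[str]:
    """Format a sequence in groups of `group` bases, `per_line` bases per
    line, and mark the number of the last base of each line on the right.

    Numbering starts at 1 with the first base and runs continuously.
    """
    # Width of a full line: the bases plus one space between groups.
    width = per_line + (per_line // group - 1)
    lines = []
    for start in range(0, len(sequence), per_line):
        chunk = sequence[start:start + per_line]
        text = " ".join(chunk[i:i + group] for i in range(0, len(chunk), group))
        last_base = start + len(chunk)  # 1-based number of the line's last base
        lines.append(f"{text:<{width}} {last_base}")
    return lines
```

For a circular sequence, the caller would first choose which nucleotide to treat as position 1 and pass the sequence starting from that nucleotide; the numbering logic itself is unchanged.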

15. A nucleotide sequence that is made up of one or more non-contiguous segments of a larger sequence or of segments from different sequences shall be numbered as a separate sequence, with a separate sequence identifier. A sequence with a gap or gaps shall be numbered as a plurality of separate sequences with separate sequence identifiers, with the number of separate sequences being equal in number to the number of continuous strings of sequence data.
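The rule in paragraph 15 for sequences with gaps can be sketched as below: each continuous string of sequence data becomes its own sequence with its own identifier. How gaps are marked in raw input is not specified here, so the hyphen-based `gap_pattern`, the integer identifiers, and the function name `split_at_gaps` are all assumptions for illustration.

```python
import re
from typing import List, Tuple


def split_at_gaps(raw: str, gap_pattern: str = r"-+", first_id: int = 1) -> List[Tuple[int, str]]:
    """Split a gapped sequence into separate sequences, one per continuous
    string of sequence data, each paired with its own identifier.

    `gap_pattern` is a hypothetical convention: runs of hyphens mark gaps.
    """
    segments = [segment for segment in re.split(gap_pattern, raw) if segment]
    return list(enumerate(segments, start=first_id))
```

The number of returned sequences equals the number of continuous strings in the input, and each one would then be numbered from base 1 on its own.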
