XYs is a simple web tool to look for polyXY regions: compositionally biased regions made up of two amino acids. Both the order of the two amino acids and the periodicity of the repeated XY units are crucial for categorizing these regions. Depending on periodicity and residue order, polyXY regions can be direpeats or degenerate (impure) polyX. We define three categories of polyXY regions:
  1. Direpeats; example: 'XYXYXY'. Unit "XY" is repeated 'n' times.
  2. Joined; example: 'XXXYYY'. First amino acid "X" is repeated, then amino acid "Y", but they are not mixed.
  3. Shuffled; example: 'XYYXYX'. The amino acids follow no apparent order.
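
The three categories can be told apart by counting how often the residue changes along the region. The sketch below is only an illustration of the definitions above, not the code used by XYs: the function name `classify_polyxy` and the `min_units` parameter are hypothetical, and it assumes the input is an uninterrupted region containing exactly two distinct amino acids.

```python
def classify_polyxy(region: str, min_units: int = 2) -> str:
    """Classify a two-residue region as 'direpeat', 'joined' or 'shuffled'.

    Illustrative only: assumes `region` has no interruptions and
    contains exactly two distinct amino acids.
    """
    if len(set(region)) != 2:
        raise ValueError("a polyXY region must contain exactly two distinct amino acids")

    # Number of positions where the residue differs from the previous one.
    transitions = sum(1 for a, b in zip(region, region[1:]) if a != b)

    # Direpeat: every adjacent pair differs, so the region strictly alternates,
    # and it holds at least `min_units` complete XY units.
    if transitions == len(region) - 1 and len(region) // 2 >= min_units:
        return "direpeat"

    # Joined: a single switch from one residue to the other.
    if transitions == 1:
        return "joined"

    # Anything else has no simple order.
    return "shuffled"


if __name__ == "__main__":
    for example in ("XYXYXY", "XXXYYY", "XYYXYX"):
        print(example, classify_polyxy(example))
```

Checking for the direpeat pattern first matters for very short regions: 'XY' alternates but has only one unit, so with the assumed `min_units` of 2 it falls through to the joined case.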
In our web tool, you only need a protein dataset in FASTA format to locate the polyXY regions. Several optional thresholds can be adjusted to filter the polyXY regions that are reported, based on their category, the amino acids forming them, or the number of repeated units.
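
As a rough picture of how such thresholds might act on a list of candidate regions, here is a minimal sketch. Everything in it is an assumption made for illustration: the `PolyXYRegion` record, the `keep_region` function, and the default values do not come from XYs and may differ from what the tool actually does.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PolyXYRegion:
    sequence: str  # residues of the region, e.g. "QAQAQA"
    category: str  # "direpeat", "joined" or "shuffled"


def keep_region(
    region: PolyXYRegion,
    min_length: int = 6,                  # hypothetical default
    required: Iterable[str] = (),         # mandatory amino acids, if any
    categories: Iterable[str] = ("direpeat", "joined", "shuffled"),
    min_units: int = 3,                   # hypothetical default, direpeats only
) -> bool:
    """Return True if the region passes every (assumed) threshold."""
    if len(region.sequence) < min_length:
        return False
    # Every mandatory amino acid must occur in the region.
    if not set(required) <= set(region.sequence):
        return False
    if region.category not in set(categories):
        return False
    # The unit threshold only applies to direpeats.
    if region.category == "direpeat" and len(region.sequence) // 2 < min_units:
        return False
    return True


if __name__ == "__main__":
    candidates = [
        PolyXYRegion("QAQAQA", "direpeat"),
        PolyXYRegion("QQQAAA", "joined"),
        PolyXYRegion("SGGSGS", "shuffled"),
    ]
    kept = [r for r in candidates if keep_region(r, required={"Q"})]
    print([r.sequence for r in kept])
```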

A) Precomputed results for complete reference proteomes

Unique proteome identifier from UniProt, or TaxID. E.g.: 'UP000005640' for Homo sapiens; 'UP000000803' for Drosophila melanogaster

Full set of polyXY regions found in all complete reference proteomes from UniProtKB release 2021_04 (32,807,774 polyXY regions).

B) Calculate from scratch


1 - File with protein sequence/s
2 - Unique proteome identifier from UniProt.
E.g.: 'UP000005640' for Homo sapiens; 'UP000000803' for Drosophila melanogaster.
3 - Paste here your sequence/s
Minimum length of the polyXY
Mandatory amino acid 1 in the resulting polyXY
Mandatory amino acid 2 in the resulting polyXY
PolyXY category:
 Direpeats ('XYXYXY')
 Joined ('XXXYYY')
 Shuffled ('XYYXYX')
Minimum number of repeated units (if Direpeats)
Maximum number of interruptions to merge polyXY

Additionally, you can download here the script to run it locally.