Bioinformatics

Linear polymers in living organisms store biological information that is, in fact, independent of its carrier. This is a complex system in which the flow of information through transcription, translation and other signal-transmission processes is realised by appropriate coding systems and control mechanisms.

information flow from DNA to protein

DNA, RNA and protein sequences are built from a limited number of chemical building blocks. In the simplest approach, 4 (nucleotide) or 20 (amino acid) residues are considered. Naturally, their physicochemical properties are essential for proper structure formation and functioning. However, the description is not limited to these features: higher-order and more abstract properties should also be included, and which ones to include is still an open question.

k-mer distribution

One example of analysis is based on extracting all k-mers, i.e. subsequences of length k. The frequencies of the k-mers allow each sequence to be presented in vector form. Applying a selected metric to these vectors then assesses the similarity between the sequences.
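The k-mer approach can be illustrated with a minimal Python sketch. The function names `kmer_vector` and `euclidean_distance` are hypothetical, and the choices of a DNA alphabet, relative frequencies and the Euclidean metric are assumptions made for illustration; the text above does not prescribe a particular metric.

```python
from collections import Counter
from itertools import product
import math


def kmer_vector(sequence, k, alphabet="ACGT"):
    """Return the relative frequency of every possible k-mer.

    The vector has len(alphabet)**k entries, ordered lexicographically
    with respect to `alphabet`. Windows containing symbols outside the
    alphabet (e.g. 'N') are ignored (an assumption of this sketch).
    """
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def euclidean_distance(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two toy sequences that differ only in the last nucleotide.
vec_a = kmer_vector("ACGTACGT", 2)
vec_b = kmer_vector("ACGTACGA", 2)
print(euclidean_distance(vec_a, vec_b))
```

A smaller distance indicates more similar k-mer composition. Other metrics, such as cosine distance, could be substituted without changing the vectorisation step.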

Chaos game representation

Chaos game representation (CGR) is a method for transforming a nucleotide sequence (letters) into a numerical series. Nucleotides are placed on a plane at the corners of a square. An example assignment is A(−1, −1), C(−1, 1), T(1, 1) and G(1, −1). We start from the origin N0 = (0, 0), and each subsequent point lies halfway between the previous point and the corner of the next nucleotide in the sequence.
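The CGR construction described above can be sketched in a few lines of Python. The corner coordinates follow the example assignment given in the text; the function name `chaos_game_representation` is hypothetical, and raising an error on symbols other than A, C, G and T is an assumption of this sketch.

```python
# Corner assignment taken from the example in the text.
CORNERS = {
    "A": (-1.0, -1.0),
    "C": (-1.0, 1.0),
    "T": (1.0, 1.0),
    "G": (1.0, -1.0),
}


def chaos_game_representation(sequence, corners=CORNERS):
    """Map a nucleotide sequence to a list of 2D points.

    Starting from the origin, each new point is the midpoint between
    the previous point and the corner of the current nucleotide.
    Unknown symbols raise a KeyError (an assumption of this sketch).
    """
    x, y = 0.0, 0.0  # starting point N0 = (0, 0)
    points = []
    for nucleotide in sequence:
        cx, cy = corners[nucleotide]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


print(chaos_game_representation("ACG"))
# -> [(-0.5, -0.5), (-0.75, 0.25), (0.125, -0.375)]
```

Each point depends on the entire prefix of the sequence, so sequences sharing a long suffix end up at nearby points in the plane.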

Comparison of chaos game and embedding

The authors intended to employ the embedding concept for amino acid representation. In this concept, words are represented as vectors, and its main feature is that similar words have similar vectors, as measured by cosine similarity or Euclidean distance. Usually, skip-gram or continuous bag of words (CBOW) models are applied for this purpose. According to the second approach, the words placed near a given word may characterise that word. Decomposing a sentence into words, and then into a series of embeddings, makes it possible to analyse it with Recurrent Neural Networks (RNNs) or, more recently, the more popular Transformers.
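Two ingredients of the embedding idea can be sketched in Python: building (context, target) training pairs in the continuous-bag-of-words style, and comparing two vectors by cosine similarity. Treating each amino acid residue as one "word", the window size, and the function names `context_target_pairs` and `cosine_similarity` are assumptions made for illustration; the text does not specify how the authors tokenise sequences or train the embeddings.

```python
import math


def context_target_pairs(tokens, window=2):
    """Build (context, target) pairs in the CBOW style.

    For every token, the context is the list of up to `window`
    neighbours on each side; the token itself is the target.
    """
    tokens = list(tokens)
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        pairs.append((left + right, target))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy peptide; each residue is treated as one "word" (assumption).
for context, target in context_target_pairs("MKVL", window=1):
    print(context, "->", target)
```

A skip-gram model would use the same windows in the opposite direction, predicting each context token from the target.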

RNA structure prediction

RNA structure representation with the chaos game method

In some cases, the RNA structure prediction can be improved based on the structure of the target product. This approach assumes that transcripts encoding homologous proteins have similar structures. Under certain assumptions, the CGR may also represent elements of the RNA secondary structure in addition to the sequence.

RNA interference

RNA interference model

This study concerns the qualitative control of transcripts in the process of RNA interference and the inner functional division of miRNA. miRNA interference inhibits the translation stage by silencing mRNA molecules. The general approach assumes that transcription factors are inaccurate and limited, especially with regard to transcript supply and demand. In some cases, the mechanism of RNA interference dominates, regardless of whether the transcript overproduction is caused by a pathogenic factor or originates from earlier, imperfect stages of gene regulation. The lack of clarity in RNAi is connected with the low complementarity between hybridizing miRNA and mRNA molecules in animals.