Project details:
A branch of DNA is a finite sequence of the nucleotides adenine (A), cytosine (C), guanine (G) and thymine (T). Accordingly, let us consider the alphabet A = {A, C, G, T}. In this exercise we intend to implement an efficient search mechanism for a regular pattern in branches of DNA. By pattern we mean an arrangement of nucleotides (i.e. the characters A, C, G, T) that can be expressed by a regular expression over A.
The purpose of this exercise is, given a regular expression over the alphabet A and a branch of DNA, to decide whether the branch contains a sequence of nucleotides that matches the pattern. The decision must, of course, be made efficiently.
For this purpose, a set of functions is provided, in particular the regex function, which translates a string into a regular expression. These functions must be copied in full at the top of your solution.
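One common way to make this decision efficiently is to compile the pattern into a nondeterministic automaton (Thompson's construction) and run it over the branch while tracking a set of active states. The Python sketch below illustrates the idea only. It does not use the provided regex function; it uses a small hand-written parser instead. It assumes the pattern syntax is limited to what the sample shows: the letters A, C, G, T, juxtaposition for concatenation, `+` for alternation (lowest precedence), postfix `*` for repetition, and parentheses. The names `build_nfa`, `closure` and `contains_match` are illustrative, not part of the assignment.

```python
def build_nfa(pattern):
    """Thompson construction. Returns (start, accept, eps, step).
    Assumed syntax: A/C/G/T, concatenation, '+' (union), '*', parentheses."""
    eps, step = [], []   # eps[s]: epsilon targets; step[s]: {char: target}
    pos = 0

    def new_state():
        eps.append([])
        step.append({})
        return len(eps) - 1

    def parse_union():                      # lowest precedence: x + y
        nonlocal pos
        start, end = parse_concat()
        while pos < len(pattern) and pattern[pos] == '+':
            pos += 1
            s2, e2 = parse_concat()
            s, e = new_state(), new_state()
            eps[s] += [start, s2]
            eps[end].append(e)
            eps[e2].append(e)
            start, end = s, e
        return start, end

    def parse_concat():                     # juxtaposition: xy
        start = end = new_state()           # starts as an empty fragment
        while pos < len(pattern) and pattern[pos] not in '+)':
            s2, e2 = parse_star()
            eps[end].append(s2)
            end = e2
        return start, end

    def parse_star():                       # postfix repetition: x*
        nonlocal pos
        start, end = parse_atom()
        while pos < len(pattern) and pattern[pos] == '*':
            pos += 1
            s, e = new_state(), new_state()
            eps[s] += [start, e]
            eps[end] += [start, e]
            start, end = s, e
        return start, end

    def parse_atom():                       # a letter or a parenthesised group
        nonlocal pos
        c = pattern[pos]
        pos += 1
        if c == '(':
            frag = parse_union()
            if pos >= len(pattern) or pattern[pos] != ')':
                raise ValueError("missing ')'")
            pos += 1
            return frag
        if c not in 'ACGT':
            raise ValueError("unexpected character: " + c)
        s, e = new_state(), new_state()
        step[s][c] = e
        return s, e

    start, accept = parse_union()
    if pos != len(pattern):
        raise ValueError("unbalanced ')'")
    return start, accept, eps, step


def closure(states, eps):
    """All states reachable from `states` through epsilon transitions."""
    seen, stack = set(states), list(states)
    while stack:
        for t in eps[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def contains_match(pattern, dna):
    """True if some substring of `dna` matches `pattern`."""
    start, accept, eps, step = build_nfa(pattern)
    start_set = closure({start}, eps)
    if accept in start_set:                 # pattern accepts the empty string
        return True
    current = set()
    for ch in dna:
        current |= start_set                # a match may begin at this position
        moved = {step[s][ch] for s in current if ch in step[s]}
        current = closure(moved, eps)
        if accept in current:
            return True
    return False
```

Each character of the branch is processed once against a state set whose size is proportional to the pattern length, so the running time grows with the product of the two lengths rather than exponentially. Re-adding the start states at every position is what turns a whole-string matcher into a substring search.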
It is also recommended to read the article: [login to view URL]
[login to view URL]~rsc/regexp/[login to view URL]
[login to view URL]
Input:
The input consists of two lines. The first line is the search pattern, given as a string. The syntax of this pattern is the concrete syntax of a regular expression, and the string is expected to be read with the provided regex function.
The second line contains the string that defines the branch of DNA (consisting exclusively of the characters 'A', 'C', 'G' and 'T').
Output:
The output is a single line containing the word YES if the branch contains a nucleotide sequence that matches the given pattern. Otherwise, the line contains the word NO.
Limits:
The pattern has a maximum length of 100 and is processed correctly by the regex function. The DNA branch is non-empty and has a length of at most 5,000.
Sample Input 1
(TAG+TC)(A+C+G+T)*TGC
ATTGCAGTAGGACTCGCCTGATGCAGTC
Sample Output 1
YES
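The input and output format, and the sample itself, can be cross-checked with a short Python sketch. It leans on Python's built-in `re` module as a stand-in matcher and assumes that `+` in the pattern means alternation, which `re` writes as `|`. The function name `solve` is hypothetical. Because `re` is a backtracking engine, this is only a sanity check for expected answers, not a replacement for the efficient search the exercise asks for.

```python
import re


def solve(input_text: str) -> str:
    """Read the two input lines (pattern, then DNA branch) and return YES or NO."""
    lines = input_text.splitlines()
    pattern, dna = lines[0].strip(), lines[1].strip()
    # Assumption: '+' is alternation in the exercise syntax; re spells it '|'.
    translated = pattern.replace("+", "|")
    # re.search looks for a match anywhere in the branch, not only at the start.
    return "YES" if re.search(translated, dna) else "NO"
```

To run it on real input, print `solve(sys.stdin.read())` after importing `sys`. On Sample Input 1 the branch contains TAG followed later by TGC, so the result is YES.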
The provided functions are in the [login to view URL]; everything in that file must be copied into the solution file.