This pipeline was built to produce an interactive web page from a rule file in ROSETTA or line-by-line format, using the Circos software. It is designed as a web-based interface running under an Apache server and uses the Perl and Bash programming languages and SQLite. The web page was written in PHP and HTML.
To run the pipeline, the user submits a file in either the ROSETTA export format or a line-by-line format. The pipeline runs with default settings, and several optional parameters are available. A complete overview of the parameters is shown below.
Parameter | Type | Range | Description |
---|---|---|---|
Rule file | File | | Local path to the rule file. |
Threshold | Integer | 0-99 | If checked, connections below the percentile threshold value are not shown in the circle. |
Rule format | Nominal | | The format of the rule file. |
E-mail | String | | A link to the results will be sent to this e-mail address when the job is finished. |
Minimum accuracy | Integer | 0-99 | If checked, rules with accuracy below the threshold are not considered. |
Minimum support | Integer | ≥0 | If checked, rules with support below the threshold are not considered. |
Groups | File | | Local path to a group file. Used to define nodes with similar color. |
Colors | File | | Local path to a color file. Used to define the colors of the nodes. |
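The effect of the minimum accuracy and minimum support parameters can be illustrated with a short sketch. It is not taken from the pipeline itself: the `Rule` type and `filter_rules` function are hypothetical names, accuracy is assumed to be stored as a fraction between 0 and 1, and the minimum accuracy parameter is assumed to be a percentage.

```python
from typing import NamedTuple, Tuple


class Rule(NamedTuple):
    """Hypothetical in-memory representation of one rule."""
    conditions: Tuple[str, ...]  # LHS conditions
    decision: str                # RHS value
    accuracy: float              # assumed to be a fraction in [0, 1]
    support: float


def filter_rules(rules, min_accuracy=0, min_support=0):
    """Keep rules that reach both thresholds.

    min_accuracy is assumed to be a percentage (0-99), matching the
    range given in the parameter table.
    """
    return [
        r for r in rules
        if r.accuracy * 100 >= min_accuracy and r.support >= min_support
    ]


rules = [
    Rule(("petal_width=*-0.8",), "setosa", 1.0, 47),
    Rule(("sepal_length=*-5.9", "sepal_width=3.5-3.9"), "setosa", 1.0, 15),
]
kept = filter_rules(rules, min_accuracy=90, min_support=20)
```

With these thresholds only the first rule remains, because the second has a support of 15.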
A list of all output files is given in the section below. The interactive web pages may be used to investigate the rules. Two interactions are available: hovering the pointer over a node in the figure shows its attribute name, and clicking the edge that connects two conditions retrieves a list of the rules that contain both.
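The edge lookup described above amounts to an index from pairs of conditions to the rules that contain both. The following is a minimal sketch of such an index, not the pipeline's own code; `index_rules_by_pair` is a hypothetical name and rules are assumed to be available as (conditions, decision) pairs.

```python
from collections import defaultdict
from itertools import combinations


def index_rules_by_pair(rules):
    """Map each unordered pair of conditions to the rules containing both."""
    index = defaultdict(list)
    for conditions, decision in rules:
        # Sort so that a pair has the same key regardless of order in the rule.
        for pair in combinations(sorted(set(conditions)), 2):
            index[pair].append((conditions, decision))
    return index


rules = [
    (["sepal_length=*-5.9", "sepal_width=3.5-3.9"], "setosa"),
    (["petal_width=*-0.8"], "setosa"),
]
pair_index = index_rules_by_pair(rules)
edge = ("sepal_length=*-5.9", "sepal_width=3.5-3.9")
rules_on_edge = pair_index[edge]
```

Rules with a single condition contribute no pairs, so they never appear behind an edge in this sketch.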
By default, the nodes (rule conditions) are grouped by the attribute name. A color for each group is generated automatically. Files that specify the groups (crv_groups.conf) and colors (crv_colors.conf) are generated by Ciruvis and can be downloaded for later use.
We now provide a general step-by-step description of the pipeline. We only describe the operations that are performed on the server to obtain a functional web page.
Starting from a rule file :
Two input formats are currently supported. Contact us if you need support for additional formats.
% Rules/patterns generated by ROSETTA.
% Exported 2012.01.04 18:44:15 by .
%
% Rules
% 68 rules.
petal_width([*, 0.8)) => class(setosa)
Supp. (LHS) = [47 object(s)]
Supp. (RHS) = [47 object(s)]
Acc. (RHS) = [1]
Cov. (LHS) = [0.348148]
Cov. (RHS) = [1]
Stab. (LHS) = [1]
Stab. (RHS) = [1]
sepal_length([*, 5.9)) AND sepal_width([3.5, 3.9)) => class(setosa)
Supp. (LHS) = [15 object(s)]
Supp. (RHS) = [15 object(s)]
Acc. (RHS) = [1]
Cov. (LHS) = [0.111111]
Cov. (RHS) = [0.319149]
Stab. (LHS) = [1]
Stab. (RHS) = [1]
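As an illustration of how an export like the one above could be read, here is a minimal sketch of a parser. It is not the pipeline's parser: `parse_rosetta` is a hypothetical name, the sketch assumes each rule and each statistic sits on its own line, and it only covers the shapes that appear in the example (comment lines starting with `%`, conditions joined by `AND`, statistics of the form `Name (LHS|RHS) = [value]`).

```python
import re

# Matches statistic lines such as "Supp. (LHS) = [47 object(s)]".
STAT = re.compile(r"^(\S+\s+\([LR]HS\))\s*=\s*\[(.*)\]$")


def parse_rosetta(text):
    """Return a list of rules with their conditions, decision and statistics."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and header comments
        if "=>" in line:
            lhs, rhs = line.split("=>", 1)
            rules.append({
                "lhs": [c.strip() for c in lhs.split(" AND ")],
                "rhs": rhs.strip(),
                "stats": {},
            })
        elif rules:
            match = STAT.match(line)
            if match:
                # Attach the statistic to the most recently seen rule.
                rules[-1]["stats"][match.group(1)] = match.group(2)
    return rules


sample = """% Rules/patterns generated by ROSETTA.
petal_width([*, 0.8)) => class(setosa)
Supp. (LHS) = [47 object(s)]
Acc. (RHS) = [1]
sepal_length([*, 5.9)) AND sepal_width([3.5, 3.9)) => class(setosa)
Supp. (LHS) = [15 object(s)]
Acc. (RHS) = [1]
"""
parsed = parse_rosetta(sample)
```

The statistic values are kept as strings here; converting `47 object(s)` to a number is left out to keep the sketch short.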
Rules may be submitted in a plain text tab-separated format using the following column structure:
Column 1: The left-hand side (LHS) of the rule expressed as a comma-separated list of the rule conditions, e.g. "Attribute1=value1,Attribute2=value2,Attribute3=value3".
Column 2: The right-hand side (RHS) of the rule expressed as the value of the decision attribute.
Column 3: Rule accuracy, defined as P(RHS|LHS).
Column 4: Rule support, defined as P(LHS)*N, where N is the number of objects in the data set.
Any additional columns are ignored.
petal_width=*-0.8	setosa	0.348148	47
sepal_length=*-5.9,sepal_width=3.5-3.9	setosa	1	15
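The column structure above can be read with a few lines of code. The sketch below is illustrative only; `parse_rule_line` is a hypothetical name, and support is read as a floating-point number because the text defines it as P(LHS)*N without stating that it must be an integer.

```python
def parse_rule_line(line):
    """Parse one tab-separated rule line into a dictionary.

    Columns: LHS conditions (comma-separated), RHS value, accuracy, support.
    Any further columns are ignored.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 columns, got {len(fields)}")
    lhs, rhs, accuracy, support = fields[:4]
    return {
        "lhs": lhs.split(","),
        "rhs": rhs,
        "accuracy": float(accuracy),
        "support": float(support),
    }


rule = parse_rule_line(
    "sepal_length=*-5.9,sepal_width=3.5-3.9\tsetosa\t1\t15\tignored"
)
```

Lines with fewer than four columns raise an error in this sketch rather than being silently skipped.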
The crv_colors.conf file lists the color for each group of conditions. Colors are specified in r,g,b format, one color per line. Groups are numbered starting from 0, so line x defines the color of group x-1. If there are more groups than colors specified, the remaining groups are colored gray.
If crv_colors.conf is not submitted by the user, x+1 colors will be generated automatically, where x is the highest group number in crv_groups.conf.
255,51,51
255,173,51
214,255,51
92,255,51
51,255,133
51,255,255
51,133,255
92,51,255
214,51,255
255,51,173
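The lookup rule for crv_colors.conf (line x defines group x-1, gray otherwise) can be sketched as follows. The function names and the exact gray value are assumptions. The `generate_colors` function is also an assumption: the ten example colors above happen to match evenly spaced hues at 80% saturation and full brightness, but that is an observation about the example, not a statement about how Ciruvis generates its palette.

```python
import colorsys

GRAY = (128, 128, 128)  # assumed value; the text only says "gray"


def parse_colors(text):
    """Read one r,g,b color per line; list index equals group number."""
    colors = []
    for line in text.strip().splitlines():
        r, g, b = (int(part) for part in line.split(","))
        colors.append((r, g, b))
    return colors


def color_for_group(colors, group):
    """Return the color of a group, or gray if no color is defined for it."""
    return colors[group] if 0 <= group < len(colors) else GRAY


def generate_colors(n):
    """Generate n colors with evenly spaced hues (assumed scheme)."""
    palette = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.8, 1.0)
        palette.append((round(r * 255), round(g * 255), round(b * 255)))
    return palette


colors = parse_colors("255,51,51\n255,173,51\n")
second = color_for_group(colors, 1)
missing = color_for_group(colors, 5)
palette = generate_colors(10)
```

Blank lines inside the file are not skipped on purpose, since skipping them would shift the line-to-group mapping.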
The crv_groups.conf file contains a list of all conditions and the group to which each belongs (numbered starting from 0). The condition and the group number are separated by a tab. Multiple conditions may belong to the same group. Conditions that are not specified in this file are colored gray.
If crv_groups.conf is not submitted by the user, a file is generated that assigns all conditions with the same attribute to the same group.
C0_0=0	0
C0_0=1	0
C1_4=0	1
C1_4=1	1
C2_8=0	2
C2_8=1	2
C3_11=0	3
C3_11=1	3
C4_15=0	4
C4_15=1	4
R0_0=0	5
R0_0=1	5
R1_21=0	6
R1_21=1	6
R2_43=0	7
R2_43=1	7
R3_64=0	8
R3_64=1	8
R4_85=0	9
R4_85=1	9
S0_0=0	5
S0_0=1	5
S1_21=0	6
S1_21=1	6
S2_43=0	7
S2_43=1	7
S3_64=0	8
S3_64=1	8
S4_85=0	9
S4_85=1	9
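The default grouping, in which all conditions with the same attribute share a group, can be sketched as below. This is an illustration rather than the pipeline's code: `default_groups` is a hypothetical name, conditions are assumed to have the form `attribute=value`, and numbering groups in order of first appearance is an assumption.

```python
def default_groups(conditions):
    """Build crv_groups.conf content with one group per attribute name.

    Groups are numbered from 0 in order of first appearance (assumed).
    """
    group_of_attribute = {}
    lines = []
    for condition in conditions:
        attribute = condition.split("=", 1)[0]
        # Reuse the attribute's group, or assign the next free number.
        group = group_of_attribute.setdefault(attribute, len(group_of_attribute))
        lines.append(f"{condition}\t{group}")
    return "\n".join(lines) + "\n"


conf_text = default_groups(["C0_0=0", "C0_0=1", "C1_4=0", "C1_4=1"])
```

Note that the example file above is not a default grouping: it places the R and S attributes in shared groups, which is the kind of customization a user-supplied crv_groups.conf allows.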