Add Rdata manipulation tools by Alanamosse · Pull Request #7 · galaxyecology/tools-ecology · GitHub

Add Rdata manipulation tools #7

Open
wants to merge 9 commits into
base: master
33 changes: 33 additions & 0 deletions tools/Rdata-reader/.shed.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
categories:
  - Ecology
description: R tools to load and export data from a binary Rdata file.
long_description: |
  The two R scripts are rdata_reader and rdata_parser. The first loads an Rdata file and gives a quick overview of its content. The parser then extracts the chosen attributes and writes each one in the most appropriate format. An option binds the selected data into a single tabular file when possible.

homepage_url: https://github.com/galaxyecology/tools-ecology/tools/Rdata
name: rdata-reader
owner: ecology
remote_repository_url: https://github.com/galaxyecology/tools-ecology/tools/Rdata
type: unrestricted

repositories:
  rdata_reader:
    description: The rdata reader tool loads an Rdata binary file and returns its attributes and a general summary.
    owner: ecology
    include:
      - rdata_reader.xml
      - rdata_reader.R
      - rdata_macros.xml
      - test-data/rdata_test.Rdata
      - test-data/attributes.tabular
      - test-data/summary.tabular

  rdata_parser:
    description: The rdata parser tool extracts and writes selected attributes from a binary Rdata file.
    owner: ecology
    include:
      - rdata_parser.xml
      - rdata_parser.R
      - rdata_macros.xml
      - test-data/rdata_test.Rdata
      - test-data/attributes.tabular
11 changes: 11 additions & 0 deletions tools/Rdata-reader/rdata_macros.xml
@@ -0,0 +1,11 @@
<macros>
    <token name="@VERSION@">0.1.0</token>
    <xml name="rdata_requirements">
        <requirements>
            <requirement type="package" version="1.20.2">r-getopt</requirement>
        </requirements>
    </xml>
    <xml name="rdata_input1">
        <param format="rdata" name="input1" type="data" label="Rdata binary file to explore"/>
    </xml>
</macros>
86 changes: 86 additions & 0 deletions tools/Rdata-reader/rdata_parser.R
@@ -0,0 +1,86 @@
#!/usr/bin/env Rscript
#Load an Rdata file and extract the selected attributes
#For each selected attribute, write a file with its value(s)

cat("Load rdata file\n")

#get the rdata file
args = commandArgs(trailingOnly=TRUE)
rdata<-load(args[1])
rdata<-get(rdata)
#sum<-summary(rdata)

#get the selected attributes to explore
attributes_selected <- args[2]
attributes<-strsplit(attributes_selected, ",") #List of elements

#write.table(sum,file = "summary.tabular",sep='\t',row.names=FALSE)
len<-length(attributes[[1]])
bind<-tail(args,n=1)

#file type definition
file_ext<-function(ext,attribute){
    file<-paste("attribute_",attribute,ext,sep="") #Filename definition
    file<-paste("outputs/",file,sep="")
    return(file)
}

cat("Write element(s) : ")
for (i in seq_len(len)){
    attribute<-attributes[[1]][i] #Get the attribute i
    if(! any(names(rdata)==attribute)){
        error<-paste(attribute, "doesn't exist in the RData. Check the input files")
        write(error, stderr())
    }

    attribute_val<-rdata[[attribute]] #Extract the value(s) by exact name

    if(is.null(attribute_val)){ #Galaxy can't produce output if NULL
        file<-file_ext(".txt",attribute)
        cat(paste(attribute,", ",sep=""))
        write("Return NULL value",file=file)
        next #Skip to the next attribute
    }

    if (typeof(attribute_val)=="list"){ #Needs to be corrected, fails in galaxy but not in R
        if(length(attribute_val)==0){
            file<-file_ext(".txt",attribute)
            sink(file=file)
            print("Empty list :") #If the list has no element, the file is empty and an error occurs in galaxy
            print(attribute_val)
            sink()
            next
        }else{
            attribute_val<-as.data.frame(do.call(rbind, attribute_val))
            file<-file_ext(".tabular",attribute)
            cat(paste(attribute,", ",sep=""))
            write.table(attribute_val,file=file,row.names=FALSE,sep="\t")
            next
        }
    }else if (typeof(attribute_val)=="language"){ #OK
        attribute_val<-toString(attribute_val,width = NULL)
        file<-file_ext(".txt",attribute)
        cat(paste(attribute,", ",sep=""))
        write(attribute_val,file=file)
        next
    }
    file<-file_ext(".tabular",attribute)
    dataframe<-as.data.frame(attribute_val)
    names(dataframe)<-attribute
    if(bind=="nobind"){
        cat(paste(attribute,", ",sep=""))
        write.table(dataframe,file=file,row.names=FALSE,sep="\t")
    }else{
        cat(paste(attribute,", ",sep=""))
        if(!exists("alldataframe")){
            alldataframe<-dataframe
        }else{
            alldataframe<-cbind(alldataframe, dataframe)
        }
    }
}

if(exists("alldataframe")&&bind=="bind"){
    write.table(alldataframe,file="outputs/all_attributes.tabular",row.names=FALSE,sep="\t")
}
q('no')
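
The bind step of the parser script can be sketched in isolation. This is a minimal sketch, not the PR's code: it assumes a hypothetical data frame standing in for the object loaded from rdata_test.Rdata (whose real values are not shown on this page), and it assumes tab-separated output, which is what the `has_line` assertion in the parser test expects. The names `rdata`, `selected`, `columns`, `bound` and `out_file` are illustrative only.

```r
# Hypothetical stand-in for the object stored in the Rdata file (assumption).
rdata <- data.frame(ID = 1:3,
                    Age = c(12, 15, 21),
                    Name = c("Dora", "John", "Ana"),
                    stringsAsFactors = FALSE)
selected <- c("ID", "Age", "Name")

# Build one single-column data frame per attribute, named after the attribute.
columns <- lapply(selected, function(attribute) {
  column <- as.data.frame(rdata[[attribute]], stringsAsFactors = FALSE)
  names(column) <- attribute
  column
})

# Bind the columns side by side; this only works when all lengths match.
bound <- do.call(cbind, columns)

# Write a tab-separated table without row names and read back the header.
out_file <- tempfile(fileext = ".tabular")
write.table(bound, file = out_file, row.names = FALSE, sep = "\t")
header <- readLines(out_file, n = 1)
```

Because `write.table` quotes column names by default, the header consists of the three quoted names separated by tabs, matching the line the Galaxy test looks for in `all_attributes`.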
75 changes: 75 additions & 0 deletions tools/Rdata-reader/rdata_parser.xml
@@ -0,0 +1,75 @@
<tool id="rdata_parser" name="Rdata binary file parser" version="@VERSION@">
    <macros>
        <import>rdata_macros.xml</import>
    </macros>
    <expand macro="rdata_requirements" />
    <command detect_errors="exit_code"><![CDATA[
        mkdir outputs &&
        Rscript '$__tool_directory__/rdata_parser.R' '$input1' '$selected_attributes' '$other' '$bind_dataframe'
    ]]>
    </command>
    <inputs>
        <expand macro="rdata_input1"/>
        <param name="rdata_attributes" type="data" label="File with .Rdata content details" format="tabular" help="Tabular file listing the Rdata attributes. It can come from the Rdata reader tool: 'List of attributes from RDATAFILE'"/>
        <param name="selected_attributes" label="Select which attribute(s) you want to extract" type="select" optional="false" multiple="true">
            <options from_dataset="rdata_attributes">
                <column name="value" index="0"/>
            </options>
        </param>
        <param name="bind_dataframe" label="Bind attributes into a single tabular file when possible" type="boolean" truevalue="bind" falsevalue="nobind" checked="true"/>
    </inputs>
    <outputs>
        <data name="other">
            <discover_datasets pattern="__designation_and_ext__" visible="true" directory="outputs" />
        </data>
    </outputs>
    <tests>
        <test>
            <param name="input1" value="rdata_test.Rdata"/>
            <param name="rdata_attributes" value="attributes.tabular"/>
            <param name="selected_attributes" value="ID,Age,Name"/>
            <param name="bind_dataframe" value="TRUE"/>
            <output name="other">
                <discovered_dataset designation="all_attributes">
                    <assert_contents>
                        <has_line line="&quot;ID&quot;&#009;&quot;Age&quot;&#009;&quot;Name&quot;"/>
                    </assert_contents>
                </discovered_dataset>
            </output>
        </test>
    </tests>
    <help><![CDATA[

==========================
Rdata parser
==========================
**What it does**


The Rdata parser tool extracts information from a file in .RData format.

|

**How to use it**

First use the Rdata reader tool to get the list of attributes available in the binary file.

Use the reader tool output to select the attribute(s) you want to extract.

If the selected attributes can be displayed side by side in a single tabular file, the "Bind" option will attempt to do so.

|

**Outputs**

The tool produces one file per selected attribute unless the "Bind" option is set to "Yes".

If the data can be displayed as a table, the output file will be a .tabular file.

|

**More information**

More information about the R data format and the save / load functions can be found here: http://www.sthda.com/english/wiki/saving-data-into-r-data-format-rds-and-rdata.
    ]]></help>
</tool>
12 changes: 12 additions & 0 deletions tools/Rdata-reader/rdata_reader.R
@@ -0,0 +1,12 @@
#!/usr/bin/env Rscript
#Return the list of attributes from an Rdata file

args = commandArgs(trailingOnly=TRUE)
rda<-load(args[1]) #Load the rdata
rdata<-get(rda)
names<-names(rdata) #Get the attributes
write(names, file = "rdata_list_attr")

sum<-summary(rdata)
write.table(sum,file = "summary.tabular",sep='\t',row.names=FALSE)
#print(str(rdata)) #Other way to give information
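
The load-then-inspect pattern used by the reader script can be tried on a throwaway file. This is a minimal sketch under stated assumptions: the data frame `people` and its values are hypothetical (they are not the contents of rdata_test.Rdata), and loading into a separate environment (`loaded`) is a choice made here to keep the example from touching the global workspace, not something the script itself does.

```r
# Hypothetical object saved to a temporary Rdata file (assumption).
people <- data.frame(ID = 1:3,
                     Age = c(12, 15, 21),
                     Name = c("Dora", "John", "Ana"))
rdata_file <- tempfile(fileext = ".Rdata")
save(people, file = rdata_file)

# load() restores the saved objects and returns their names as a character vector.
loaded <- new.env()
object_names <- load(rdata_file, envir = loaded)

# Fetch the first restored object by name and list its attributes.
rdata <- get(object_names[1], envir = loaded)
attribute_names <- names(rdata)

# Same two outputs as the reader script, written to temporary files here.
attr_file <- tempfile()
write(attribute_names, file = attr_file)
summary_file <- tempfile(fileext = ".tabular")
write.table(summary(rdata), file = summary_file, sep = "\t", row.names = FALSE)
```

Taking `object_names[1]` explicitly makes the single-object assumption visible: an Rdata file saved with several objects would need a decision about which one to explore.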
53 changes: 53 additions & 0 deletions tools/Rdata-reader/rdata_reader.xml
@@ -0,0 +1,53 @@
<tool id="rdata_reader" name="Rdata binary file reader" version="@VERSION@">
    <macros>
        <import>rdata_macros.xml</import>
    </macros>
    <expand macro="rdata_requirements" />
    <command detect_errors="exit_code"><![CDATA[
        Rscript '$__tool_directory__/rdata_reader.R' '$input1' '$rdata_list_attr' ]]>
    </command>
    <inputs>
        <expand macro="rdata_input1"/>
    </inputs>
    <outputs>
        <data format="tabular" name="rdata_list_attr" from_work_dir="rdata_list_attr" label="List of attributes from ${input1.name}"/>
        <data name="summary" from_work_dir="summary.tabular" format="tabular" label="Summary of ${input1.name} attributes"/>
    </outputs>
    <tests>
        <test>
            <param name="input1" value="rdata_test.Rdata"/>
            <output name="rdata_list_attr" file="attributes.tabular"/>
            <output name="summary" file="summary.tabular"/>
        </test>
    </tests>
    <help>
==========================
Rdata binary file reader
==========================
**What it does**

The Rdata reader tool gives information on the content of a binary Rdata file.

|

**How to use it**

Select a file in Rdata format and execute the tool.

Run this tool before using "Rdata binary file parser".

|

**Outputs**

A list of the attributes available in the Rdata file (tabular output file).

A summary file of the inner attributes and their values.

|

**More information**

More information about the R data format and the save / load functions can be found here: http://www.sthda.com/english/wiki/saving-data-into-r-data-format-rds-and-rdata.
    </help>
</tool>
3 changes: 3 additions & 0 deletions tools/Rdata-reader/test-data/attributes.tabular
@@ -0,0 +1,3 @@
ID
Age
Name
Binary file added tools/Rdata-reader/test-data/rdata_test.Rdata
Binary file not shown.
7 changes: 7 additions & 0 deletions tools/Rdata-reader/test-data/summary.tabular
@@ -0,0 +1,7 @@
" ID" " Age" " Name"
"Min. :1 " "Min. :12.0 " "Dora:1 "
"1st Qu.:2 " "1st Qu.:12.0 " "John:1 "
"Median :3 " "Median :15.0 " "name:3 "
"Mean :3 " "Mean :16.2 " NA
"3rd Qu.:4 " "3rd Qu.:21.0 " NA
"Max. :5 " "Max. :21.0 " NA