The TCGA clinical data are distributed as large XML files that are hard to read and awkward to work with, which makes it difficult to explore the full potential of TCGA. These XML files contain all kinds of information, but their tree-like structure prevents them from being used directly in a tabular analysis. A single patient can have multiple drug records at the same time and, unsurprisingly, multiple follow-up records as well. This package therefore not only provides tools for integrating the clinical data, but also records some thoughts about cancer and data mining.
library(devtools)
install_github("FanZhang9/TCGAmc")
This is a basic example which shows you how to solve a common problem:
library(TCGAcm)
## basic example code
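The one-to-many structure described above (one patient, several drug and follow-up records) can be illustrated with a minimal sketch. It does not use this package's functions; it relies on the `xml2` package and a hypothetical, heavily simplified XML snippet. The element names (`patient`, `barcode`, `drug`, `follow_up`, `vital_status`) are assumptions made for illustration and do not match the real TCGA schema, which uses namespaces and many more fields.

```r
library(xml2)

# Hypothetical toy record, not real TCGA data or the real TCGA schema.
doc <- read_xml('<patient>
  <barcode>PATIENT-01</barcode>
  <drugs>
    <drug><name>DrugA</name></drug>
    <drug><name>DrugB</name></drug>
  </drugs>
  <follow_ups>
    <follow_up><vital_status>Alive</vital_status></follow_up>
    <follow_up><vital_status>Dead</vital_status></follow_up>
  </follow_ups>
</patient>')

# The patient identifier appears once in the tree.
barcode <- xml_text(xml_find_first(doc, "./barcode"))

# Each repeated child becomes one row; the identifier is recycled
# so every row can be joined back to its patient.
drugs <- data.frame(
  barcode = barcode,
  drug_name = xml_text(xml_find_all(doc, "./drugs/drug/name")),
  stringsAsFactors = FALSE
)

follow_ups <- data.frame(
  barcode = barcode,
  vital_status = xml_text(
    xml_find_all(doc, "./follow_ups/follow_up/vital_status")
  ),
  stringsAsFactors = FALSE
)

print(drugs)
print(follow_ups)
```

Keeping drugs and follow-ups in separate tables, linked by the patient identifier, avoids duplicating rows when a patient has several records of each kind.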