
Overview

Genekitr is a gene analysis toolkit based on R.

Five core features:

Supported organisms:

For more details, please refer to this site.

🛠 Installation

New features are available in versions later than 1.0.0.

Install the stable version from CRAN:

install.packages("genekitr")

Install the development version from GitHub (requires the remotes package):

remotes::install_github("GangLiLab/genekitr")

Install the development version from Gitee (for users in mainland China; requires the remotes package):

remotes::install_git("https://gitee.com/genekitr/pacakge_genekitr")

📚 Vignette

https://www.genekitr.fun/

🧙🏻‍♂️ Tell a story ~ why develop genekitr?

Genes are the basic unit of omics research, just as cells are the basic unit of our bodies.

However, working with genes can be tedious.

Here, I want to tell you a story about Mr. Doodle, a computational biology student. Let us welcome Mr. Doodle to introduce his daily work with his PI…

Scene 1: repeat work

The PI gave Doodle 30 genes and asked him to look up their locations (ideally with sequences) and exact names. Doodle searched NCBI for each gene one by one and copied and pasted the results into Excel. Doodle sent the file to the PI one hour later, and the PI smiled, “Well done! Now I have another 50!”

Doodle wonders: how can he avoid this repetitive searching?

Scene 2: embarrassing name

The PI gave Doodle a DEG (differentially expressed gene) matrix and a file listing target genes, and asked him to check whether the target genes were up-regulated after treatment. After a while, Doodle found that PDL1 was missing from the matrix even though it appeared in the gene list. “Do we have the PDL1 gene?” he asked the PI, and the PI smiled, “Of course! You need to check CD274 instead of PDL1, which is an alias!”

Doodle was confused: how can he distinguish between an official gene name and an alias?

Scene 3: outdated database

Doodle took the up-regulated gene symbols from the last DEG matrix to run a KEGG analysis. KEGG only supports Entrez IDs, so he needed to convert the symbols to Entrez IDs. He found that some symbols had no matching Entrez ID, even though NCBI listed one. Doodle remembered that he was using org.db v3.12, while the current version was v3.15. After he updated the annotation package, he finally got all the matched IDs.

Doodle wonders: is there a method that gives him up-to-date results without checking versions himself every time?

Scene 4: incompatible format

The PI ran an enrichment analysis on his own on the GeneOntology website and asked Doodle to visualize the result. “Could you please plot a pathway bubble plot? Also, I want the x-axis to show FoldEnrichment,” the PI smiled. Doodle wanted to use the clusterProfiler R package for the plot, but he found that it only accepts its own object type. So he bit the bullet and coded the plot himself with ggplot2.

Doodle wonders: why is there no tool that supports standard enrichment data frames?

Scene 5: annoying plot theme

Doodle finally finished the bubble plot and sent it to the PI. After 15 minutes, the PI sent him a message with a smile: “It seems the plot text is too small. Could you also give me a white background with a 4 pt border?” Doodle adjusted the ggplot theme function, which took 10 minutes. After a while, the PI sent another message: “I saw the second version; maybe the border is too thick. Could you replot?”

Doodle wonders: is there a function that could handle the plot theme for him, so he does not have to change the code again and again?

Scene 6: limited plot types

Once Doodle had the GO enrichment analysis result, the PI asked him to think about how to present it nicely. Doodle found that every tool has its own specific plots. For example, WEGO can compare BP, CC, and MF terms; GOplot has a chord plot to show the relationships between genes and GO terms; clusterProfiler supports enrichment maps and networks, which can explore the relationships among enriched terms. One big problem is that their input data formats are not compatible, so it is inconvenient to draw WEGO-style plots from clusterProfiler objects.

Doodle wonders: is there a method that could bring together beautiful plots from different tools with one universal data format?

Scene 7: chaotic export files

Doodle finished the differential expression analysis and the GO/KEGG enrichment analysis, and the PI asked him to send over all the result files. Doodle first saved the results into three Excel files named “DEG_data.xlsx”, “GO_enrich.xlsx”, and “KEGG_enrich.xlsx”. Then he packed the three files into one zipped folder named with the date, and finally sent it to the PI. After a while, the PI sent him a message: “Could you put all three results into one Excel file?”

Doodle wonders: is there a way to save all the data into one file without much manual work?

If you have ever run into one or more problems like Mr. Doodle's, try genekitr!

✍️ Author

Yunze Liu

🔖 Citation

To be updated…

💓 Welcome to contribute

If you are interested in genekitr, you are welcome to contribute your ideas as follows: