Commit 9ddf836

add readme
1 parent 49d1928 commit 9ddf836

1 file changed

Lines changed: 37 additions & 0 deletions

README.md

@@ -0,0 +1,37 @@
1+
**text2vec** is an R package which provides an efficient framework with a concise API for text analysis and natural language processing (NLP).
2+
3+
The goals we set for the development of `text2vec`:
4+
5+
* **Concise** - expose as few functions as possible
6+
* **Consistent** - expose unified interfaces, so there is no need to learn a new interface for each task
7+
* **Flexible** - make complex tasks easy to solve
8+
* **Fast** - maximize efficiency per single thread, transparently scale to multiple threads on multicore machines
9+
* **Memory efficient** - use streams and iterators, and avoid keeping data in RAM where possible
10+
11+
See the [API](http://text2vec.org/api.html) section for details.
12+
13+
# Performance
14+
15+
![htop](http://text2vec.org/images/htop.png)
16+
17+
This package is efficient because it is carefully written in C++, which also means that text2vec is memory friendly. Some parts are fully parallelized using OpenMP.
18+
19+
Other embarrassingly parallel tasks (such as vectorization) can use any fork-based parallel backend on UNIX-like machines. They can achieve near-linear scalability with the number of available cores.
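
As a rough illustration of the fork-based pattern, here is a minimal sketch in base R using `parallel::mclapply`. It does not use text2vec's own API; the toy corpus and the helper names `count_terms` and `merge_counts` are hypothetical, and the worker count is an arbitrary choice.

```r
library(parallel)

# Hypothetical toy corpus; not part of text2vec.
docs <- c("the quick brown fox", "jumps over the lazy dog",
          "the dog barks", "a fox runs")

# Count term frequencies for one chunk of documents.
count_terms <- function(chunk) {
  tokens <- unlist(strsplit(tolower(chunk), "\\s+"))
  tab <- table(tokens)
  setNames(as.integer(tab), names(tab))
}

# Merge two named count vectors by summing counts of shared terms.
merge_counts <- function(a, b) {
  terms <- union(names(a), names(b))
  out <- setNames(integer(length(terms)), terms)
  out[names(a)] <- out[names(a)] + a
  out[names(b)] <- out[names(b)] + b
  out
}

# Forking is unavailable on Windows, so fall back to a single worker there.
n_workers <- if (.Platform$OS.type == "unix") 2L else 1L

# Split the documents into one chunk per worker.
chunks <- split(docs, rep_len(seq_len(n_workers), length(docs)))

# Each chunk is processed in a forked child process.
partial <- mclapply(chunks, count_terms, mc.cores = n_workers)

# Combine the per-chunk results in the parent process.
total <- Reduce(merge_counts, partial)
```

Because each chunk is processed independently and only the small per-chunk summaries are merged, this kind of workload is the sort that can scale close to linearly with the number of cores.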
20+
21+
Finally, a streaming API means that users do not have to load all the data into RAM.
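
The idea behind streaming can be sketched in base R by reading a file in fixed-size chunks from a connection. This is only an illustration of the principle, not text2vec's iterator API; the temporary file, its contents, and the `chunk_size` value are assumptions made for the example.

```r
# Write a small hypothetical corpus to a temporary file, one document per line.
path <- tempfile(fileext = ".txt")
writeLines(c("the quick brown fox", "jumps over the lazy dog",
             "the dog barks", "a fox runs", "the end"), path)

con <- file(path, open = "r")
chunk_size <- 2L   # number of lines held in memory at once (illustrative)
n_docs <- 0L
n_tokens <- 0L

repeat {
  # Read the next chunk; an empty result means the end of the file.
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0L) break
  n_docs <- n_docs + length(lines)
  n_tokens <- n_tokens + length(unlist(strsplit(lines, "\\s+")))
}

close(con)
unlink(path)
```

At no point does the loop hold more than `chunk_size` lines in memory, so the same code works on a file much larger than available RAM.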
22+
23+
24+
# Contributing
25+
26+
The package has an [issue tracker on GitHub](https://github.com/dselivanov/text2vec/issues) where I file feature requests and notes for future work. Any ideas are appreciated.
27+
28+
Contributors are welcome. You can help by:
29+
30+
- testing and leaving feedback on the [GitHub issue tracker](https://github.com/dselivanov/text2vec/issues) (preferably) or directly by e-mail
31+
- forking and contributing (check [our code style guide](https://github.com/dselivanov/text2vec/wiki/Code-style-guide)). Vignettes, docs, tests, and use cases are very welcome
32+
- giving me a star on the [project page](https://github.com/dselivanov/text2vec) :-)
33+
34+
# License
35+
36+
GPL (>= 2)
37+
