273 commits
4e57b2c
Update constraints.md
dadishimwe Jun 2, 2025
6df4851
Update guide.md
dadishimwe Jun 2, 2025
2d5c7b2
Update guide.md
dadishimwe Jun 2, 2025
9589931
Update README.md
dadishimwe Jun 2, 2025
56eaf47
Update README.md
dadishimwe Jun 2, 2025
7f3be63
Update constraints.md
dadishimwe Jun 2, 2025
7ff367f
Update constraints.md
dadishimwe Jun 2, 2025
7e16553
Update README.md
dadishimwe Jun 2, 2025
8f631ff
Update constraints.md
dadishimwe Jun 2, 2025
845fee9
Update README.md
dadishimwe Jun 2, 2025
1fa4856
Update README.md
dadishimwe Jun 2, 2025
3bc126f
Update README.md
dadishimwe Jun 2, 2025
637124f
Update README.md
dadishimwe Jun 2, 2025
e37f4c3
Update README.md
dadishimwe Jun 2, 2025
e131c73
Update README.md
dadishimwe Jun 2, 2025
47d5cd3
Update README.md
dadishimwe Jun 2, 2025
f29e2f9
Update README.md
dadishimwe Jun 2, 2025
1f6474b
Update README.md
dadishimwe Jun 2, 2025
35abd6a
Update README.md
dadishimwe Jun 2, 2025
2beea82
Update README.md
dadishimwe Jun 2, 2025
b4e8ce4
Merge pull request #13 from dadishimwe/main
MadiMalik Jun 3, 2025
2ddf161
Merge branch 'main' into fix-starter-ci-errors
MadiMalik Jun 3, 2025
49b0386
fixed more issues
MadiMalik Jun 3, 2025
85e29b5
Merge branch 'fix-starter-ci-errors' of https://github.com/MIT-Emergi…
MadiMalik Jun 3, 2025
eb20413
fix some errors
MadiMalik Jun 3, 2025
41a591b
Merge pull request #12 from MIT-Emerging-Talent/fix-starter-ci-errors
MadiMalik Jun 3, 2025
159725c
Fix some errors
AhmedKhalifa7 Jun 3, 2025
b737e57
fix ci errors README.md
AhmedKhalifa7 Jun 4, 2025
2b95b56
fixing ci errors
AhmedKhalifa7 Jun 4, 2025
bfd94e7
Merge remote-tracking branch 'origin/main' into doc/main-readme
AhmedKhalifa7 Jun 4, 2025
6b06117
fixing ci errors
AhmedKhalifa7 Jun 4, 2025
c846960
Merge pull request #14 from MIT-Emerging-Talent/doc/main-readme
MadiMalik Jun 4, 2025
e367a33
Update README.md with correct clone link
NoorelsalamAlmakki Jun 5, 2025
88c4266
Merge pull request #15 from MIT-Emerging-Talent/doc/main-readme
MadiMalik Jun 6, 2025
26a1aa6
Refined the structure of the '0_domain_study' folder
Jun 12, 2025
3d4660e
Fixed main README.md linting errors
NoorelsalamAlmakki Jun 12, 2025
89c499a
Merge pull request #20 from MIT-Emerging-Talent/fix/main-readme-ci-ch…
MadiMalik Jun 12, 2025
064b587
Merge pull request #19 from MIT-Emerging-Talent/enhancement/research-…
MadiMalik Jun 12, 2025
2ffb7e9
upstream
dadishimwe Jun 13, 2025
b45a804
Update communication.md
dadishimwe Jun 13, 2025
9915241
Update communication.md
dadishimwe Jun 13, 2025
de19826
Update communication.md
dadishimwe Jun 13, 2025
01c056d
Update communication.md
dadishimwe Jun 13, 2025
1da488c
add intial research questions
MadiMalik Jun 13, 2025
4f38e06
Merge pull request #23 from MIT-Emerging-Talent/sync-upstream
MadiMalik Jun 13, 2025
f82a117
Merge pull request #26 from MIT-Emerging-Talent/initial-research-ques…
dadishimwe Jun 13, 2025
08887a8
Add first draft of BNPL risk prediction sources
AlhassenSabeeh Jun 15, 2025
a696cc3
Add first draft of BNPL risk prediction sources
AlhassenSabeeh Jun 15, 2025
454eb01
Add first draft of BNPL risk prediction sources
AlhassenSabeeh Jun 15, 2025
9c79a20
Fix Ci checks
AlhassenSabeeh Jun 15, 2025
9a6b1c2
Complete Milestone 1 deliverables
MadiMalik Jun 15, 2025
210c57e
Add final questions
dadishimwe Jun 15, 2025
3c462c4
Update final_questions.md
dadishimwe Jun 15, 2025
0e0c570
Update final_questions.md
dadishimwe Jun 15, 2025
68ba99c
Update final_questions.md
dadishimwe Jun 15, 2025
d1764b9
Update final_questions.md
dadishimwe Jun 15, 2025
05eac14
Update final_questions.md
dadishimwe Jun 15, 2025
6cf4136
Update final_questions.md
dadishimwe Jun 15, 2025
1026407
Add first draft of BNPL risk prediction sources
AlhassenSabeeh Jun 15, 2025
980c881
Add first draft of BNPL risk prediction sources
AlhassenSabeeh Jun 15, 2025
8e97bfd
Add first draft of BNPL risk prediction sources
AlhassenSabeeh Jun 15, 2025
bf601ec
rewrite README.md in /0_domain_study
AhmedKhalifa7 Jun 15, 2025
823da75
rewrite /0_domian_study/README.md
AhmedKhalifa7 Jun 15, 2025
12f4629
Update README.md
AhmedKhalifa7 Jun 15, 2025
920d55f
Merge pull request #37 from MIT-Emerging-Talent/main
dadishimwe Jun 15, 2025
47cc071
Update README.md
dadishimwe Jun 15, 2025
809f9c1
Update README.md
dadishimwe Jun 15, 2025
9434cc8
Merge pull request #28 from MIT-Emerging-Talent/add-guide-md-domain-s…
dadishimwe Jun 15, 2025
63e459b
Update README.md
dadishimwe Jun 15, 2025
f00c8db
Update final_questions.md
dadishimwe Jun 15, 2025
9a5b4b7
Update README.md
dadishimwe Jun 15, 2025
a90447c
Merge pull request #35 from MIT-Emerging-Talent/0_domain_study/README.md
dadishimwe Jun 15, 2025
c4bc65a
Merge pull request #34 from MIT-Emerging-Talent/sources
dadishimwe Jun 15, 2025
66134a8
updated 1_problem_identification.md to resolve #32
MadiMalik Jun 15, 2025
b117dd9
Merge branch 'main' into doc/updated-1_problem_identification.md
MadiMalik Jun 15, 2025
acecd59
Merge pull request #41 from MIT-Emerging-Talent/doc/updated-1_problem…
dadishimwe Jun 15, 2025
a58131f
Update README.md
MadiMalik Jun 15, 2025
dbb8859
Merge pull request #38 from MIT-Emerging-Talent/constraints
MadiMalik Jun 15, 2025
f521817
updated 0_domain_study/README.md to resolve #42
MadiMalik Jun 15, 2025
73fe5ee
Merge pull request #43 from MIT-Emerging-Talent/doc/update-0_domain_s…
dadishimwe Jun 15, 2025
e49713f
Merge branch 'main' into main
MadiMalik Jun 15, 2025
2dbd7fb
Merge pull request #29 from dadishimwe/main
MadiMalik Jun 15, 2025
5ee25ea
Add more sources supporting system thinking approach
dadishimwe Jun 15, 2025
e4120e4
Modified the research questions
Jun 16, 2025
7fe2c17
Added the CONTRIBUTING.md file
Jun 16, 2025
4cc662c
Update sources.md
dadishimwe Jun 16, 2025
1276ad0
Merge pull request #48 from MIT-Emerging-Talent/update-0_domain_study…
AhmedKhalifa7 Jun 16, 2025
a775919
Merge pull request #44 from MIT-Emerging-Talent/doc/main-readme-adjus…
dadishimwe Jun 16, 2025
60c344f
Merge pull request #46 from MIT-Emerging-Talent/doc/contributing-file
dadishimwe Jun 16, 2025
d935433
added raw datasets for our research question to solve #51
MadiMalik Jun 28, 2025
91dfc51
deleted a file to solve #51
MadiMalik Jun 28, 2025
52c548c
Merge pull request #52 from MIT-Emerging-Talent/raw_data
dadishimwe Jun 28, 2025
4ba71a0
Changes
dadishimwe Jun 28, 2025
13b27b6
Update README.md
dadishimwe Jun 28, 2025
915604c
cleaned 1_datasets/raw_data/afdr_a8.csv and saved the cleaned version…
MadiMalik Jun 28, 2025
860ca08
cleaned 1_datasets/raw_data/afdr_a8.csv and saved the cleaned version…
MadiMalik Jun 28, 2025
864f272
Merge pull request #55 from MIT-Emerging-Talent/Update_1_datasets/Rea…
MadiMalik Jun 28, 2025
09968c0
cleaned 1_datasets/raw_data/afdr_charts.csv and saved the cleaned ver…
MadiMalik Jun 28, 2025
5653fd2
Merge branch 'main' into data_preparation
MadiMalik Jun 28, 2025
944f185
Merge pull request #56 from MIT-Emerging-Talent/data_preparation
MadiMalik Jun 28, 2025
d2a6e62
added .gitignore to hide .DS_Store folder
MadiMalik Jun 28, 2025
22a6a37
2 BNPL files data cleaned
MyatCharm Jun 29, 2025
0ef17ee
Merge branch 'data_preparation' of https://github.com/MIT-Emerging-Ta…
MyatCharm Jun 29, 2025
a72cc87
Merge branch 'main' into data_preparation
MyatCharm Jun 29, 2025
9312cb9
ruff fixed
MyatCharm Jun 29, 2025
f0db820
Update 03_clean_BNPL.ipynb
MadiMalik Jun 29, 2025
abec92f
changed the Loan_Characteristic column to it's original values
MadiMalik Jun 29, 2025
e7203ed
added 1_datasets/reference/afdr_charts_cleaned.csv
MadiMalik Jun 29, 2025
66ec739
Merge pull request #59 from MIT-Emerging-Talent/data_preparation_v3
MadiMalik Jun 29, 2025
7daaa41
Relocate BNPL_intention_to_use_cleaned.csv to processed_datasets for …
AlhassenSabeeh Jun 29, 2025
caa6478
Merge pull request #60 from MIT-Emerging-Talent/relocate/bnpl-intenti…
MadiMalik Jun 29, 2025
a6aaf73
cleaned 1_datasets/raw_data/FRBNY-SCE-Credit-Access-complete_microdat…
MadiMalik Jun 30, 2025
1f39a62
Merge branch 'main' into data_preparation_v3
MadiMalik Jun 30, 2025
d3b3da3
Merge pull request #61 from MIT-Emerging-Talent/data_preparation_v3
dadishimwe Jun 30, 2025
bb490cd
docs: add first draft of milestone 2 retrospective
AlhassenSabeeh Jun 30, 2025
ae69c81
added new reserch question and describe how we want to model it to so…
MadiMalik Jun 30, 2025
ecbb0a2
Update README.md
dadishimwe Jun 30, 2025
10b3542
Update README.md
dadishimwe Jun 30, 2025
d4a577b
Merge pull request #64 from MIT-Emerging-Talent/update-main-read
NoorelsalamAlmakki Jun 30, 2025
5d662ad
Organized and fixed the folder '02_data_preparation'
Jun 30, 2025
0ca62c6
add updated 2_data_preparation/README.md
MadiMalik Jun 30, 2025
aa32167
update to solve #65
MadiMalik Jun 30, 2025
bb97352
fix github-action-markdown-cli@v3.3.0 to solve #65
MadiMalik Jun 30, 2025
8425ec2
Merge branch 'main' into 2_data_preparation-README-update
MadiMalik Jun 30, 2025
c809ff9
Merge pull request #66 from MIT-Emerging-Talent/2_data_preparation-RE…
dadishimwe Jun 30, 2025
5b5eb31
Merge pull request #62 from MIT-Emerging-Talent/retrospectives/2_data…
dadishimwe Jun 30, 2025
8557354
Update README.md
dadishimwe Jun 30, 2025
7b53bb6
Apply ruff formatting
dadishimwe Jun 30, 2025
484069b
Merge branch 'main' into cleaning/processed-dataset-finalization
dadishimwe Jun 30, 2025
0187b1f
Update README.md
dadishimwe Jun 30, 2025
fc0806d
Merge pull request #68 from MIT-Emerging-Talent/cleaning/processed-da…
dadishimwe Jun 30, 2025
2d3274a
Created the visualization and EDA for the processed data
Jun 30, 2025
51da4ac
Rename files to follow snake_case convention, fixing markdown errors …
dadishimwe Jul 1, 2025
47aea9b
Fixing
dadishimwe Jul 1, 2025
36d27c1
Temporary rename for case-insensitive filesystem
dadishimwe Jul 1, 2025
b4635a4
Rename directories to snake_case format
dadishimwe Jul 1, 2025
4d0409c
Merge branch 'main' into feature/data-exploration
dadishimwe Jul 1, 2025
8a16325
Merge pull request #69 from MIT-Emerging-Talent/feature/data-exploration
dadishimwe Jul 1, 2025
8e7f8db
Update Charm's retro1_problem_identification.md
MyatCharm Jul 2, 2025
56351cf
Merge branch 'main' into retrospectives/2_data_collection
MyatCharm Jul 2, 2025
7ba9786
Merge pull request #70 from MIT-Emerging-Talent/retrospectives/2_data…
AlhassenSabeeh Jul 2, 2025
e9c0043
20Junmeetingminutes
MyatCharm Jul 5, 2025
607dd2b
snake_case
MyatCharm Jul 5, 2025
c0a1bfb
Merge branch 'main' into meeting_minutes
MyatCharm Jul 5, 2025
e185d2b
4july
MyatCharm Jul 5, 2025
b838810
changing folders
MyatCharm Jul 5, 2025
3ac26d7
added 5th july
MyatCharm Jul 5, 2025
64cc7e8
Update 20_jun_2025.md
dadishimwe Jul 5, 2025
885ace7
update file names
MyatCharm Jul 5, 2025
1b648bd
Merge branch 'meeting_minutes' of https://github.com/MIT-Emerging-Tal…
MyatCharm Jul 5, 2025
cdb605f
update 28th June
MyatCharm Jul 5, 2025
4ca097e
More updates on meeting minutes
MyatCharm Jul 6, 2025
d9d1b3b
Update july_4_2025.md
MyatCharm Jul 6, 2025
944ea0a
File names change
MyatCharm Jul 6, 2025
9d49cc1
Merge branch 'meeting_minutes' of https://github.com/MIT-Emerging-Tal…
MyatCharm Jul 6, 2025
86e72cc
update attendees
MyatCharm Jul 7, 2025
c770f9c
Added README.md for 1_datasets folder
Jul 7, 2025
e24dd45
Merge pull request #77 from MIT-Emerging-Talent/doc/dataset-Readme
MyatCharm Jul 8, 2025
ee32ef9
Add Loan Default Prediction and Loan Default Datasets
dadishimwe Jul 9, 2025
9b18766
Merge branch 'main' into meeting_minutes
dadishimwe Jul 9, 2025
ca79a1f
Merge pull request #72 from MIT-Emerging-Talent/meeting_minutes
dadishimwe Jul 9, 2025
32a88da
Merge branch 'main' into add-datasets
dadishimwe Jul 9, 2025
b787e2c
Merge pull request #78 from MIT-Emerging-Talent/add-datasets
MadiMalik Jul 10, 2025
02fe7bd
clean_public2024
MyatCharm Jul 11, 2025
8540d8f
-
MyatCharm Jul 11, 2025
f64abc6
Adding description for loan_default_dataset and loan_default_predicti…
dadishimwe Jul 11, 2025
9d27771
Merge pull request #82 from MIT-Emerging-Talent/doc/add-description-4…
MadiMalik Jul 12, 2025
853a9ff
Merge branch 'main' into clean_public2024
MadiMalik Jul 12, 2025
f11d21a
upload_accepted
MyatCharm Jul 13, 2025
a29be5b
oldies into a folder
MyatCharm Jul 13, 2025
81047b2
Merge pull request #81 from MIT-Emerging-Talent/clean_public2024
MadiMalik Jul 14, 2025
fc7faa4
Merge branch 'main' into uploading_new
MadiMalik Jul 14, 2025
e97beb1
Merge pull request #86 from MIT-Emerging-Talent/uploading_new
MadiMalik Jul 14, 2025
e8400fb
upload rejected and description
MyatCharm Jul 14, 2025
8afe440
new meeting minute
MyatCharm Jul 14, 2025
dbce545
Merge branch 'main' into meeting_minutes
MyatCharm Jul 14, 2025
602eaf3
restructured the raw_data file to solve #89
MadiMalik Jul 14, 2025
39fa72a
Merge pull request #88 from MIT-Emerging-Talent/meeting_minutes
MadiMalik Jul 14, 2025
a0cbee3
restructured the raw_data file to solve #89
MadiMalik Jul 14, 2025
90f0c48
restructured the raw_data file to solve #89
MadiMalik Jul 14, 2025
2d00491
Merge branch 'main' into update-raw_data_file
MyatCharm Jul 14, 2025
b7fc09c
Update README.md
MyatCharm Jul 14, 2025
7308146
try fixing linting
MyatCharm Jul 14, 2025
304df25
Merge branch 'main' into uploading_new
MyatCharm Jul 14, 2025
0d63701
Merge pull request #91 from MIT-Emerging-Talent/update-raw_data_file
MyatCharm Jul 14, 2025
d999436
fixed the error from 1_datasets/raw_data/new_datasets/README.md
MadiMalik Jul 14, 2025
fab569e
Merge pull request #92 from MIT-Emerging-Talent/fix-error-1_datasets/…
MyatCharm Jul 14, 2025
844ef1f
Merge branch 'main' into uploading_new
MadiMalik Jul 14, 2025
0399a09
Merge pull request #87 from MIT-Emerging-Talent/uploading_new
MadiMalik Jul 14, 2025
04c89e9
move files inside
MyatCharm Jul 14, 2025
7fd690a
Merge branch 'main' into uploading_new
MyatCharm Jul 14, 2025
f816acb
Merge pull request #93 from MIT-Emerging-Talent/uploading_new
MadiMalik Jul 15, 2025
7399309
updated main README to solve #73
MadiMalik Jul 15, 2025
3b6f0f9
fixed error to solve #95
MadiMalik Jul 15, 2025
80b51a6
fixed error to solve #95
MadiMalik Jul 15, 2025
35fd161
fixed error to solve #95
MadiMalik Jul 15, 2025
e07ff7b
fixed error to solve #95
MadiMalik Jul 15, 2025
8c852bf
Fixed md errors and improved text readability
Jul 15, 2025
c3aa7f8
Fix the md line-length error in line 95
Jul 15, 2025
488d46c
Merge pull request #94 from MIT-Emerging-Talent/update-main-README-to…
dadishimwe Jul 16, 2025
80fec11
Sample_final_data_upload
MyatCharm Jul 16, 2025
6ccac9c
Update Meeting Minutes
MyatCharm Jul 17, 2025
8465b25
Merge branch 'main' into meeting_minutes
MyatCharm Jul 17, 2025
1b150ea
push_data_clean_file
MyatCharm Jul 18, 2025
6565527
Merge branch 'main' into p2p_cleaned
MyatCharm Jul 18, 2025
3e42030
ruff
MyatCharm Jul 18, 2025
fcfbb29
ruff fixed
MyatCharm Jul 18, 2025
3bcf796
Class balancing with class_weights
MyatCharm Jul 18, 2025
12e2209
ruff fixed
MyatCharm Jul 18, 2025
8d52bce
reuploading data do be shown in GitHub
AlhassenSabeeh Jul 18, 2025
46e0b89
Merge pull request #100 from MIT-Emerging-Talent/p2p_cleaned
MadiMalik Jul 18, 2025
c6c793c
Cleaned data prep and clean to solve #117
MadiMalik Jul 19, 2025
9523c7d
fixed error to solve #117
MadiMalik Jul 19, 2025
7fc9ac6
Compile version by dadi
MyatCharm Jul 19, 2025
068b155
snake case
MyatCharm Jul 19, 2025
b81a72b
snake_case
MyatCharm Jul 19, 2025
7cfe83f
Merge branch 'main' into p2p_cleaned
MyatCharm Jul 19, 2025
7417553
Merge pull request #120 from MIT-Emerging-Talent/clean-and-prepare-fresh
MyatCharm Jul 19, 2025
66b7d9a
Merge branch 'main' into p2p_cleaned
MyatCharm Jul 19, 2025
b1283d6
Merge branch 'main' into reuploading_data
MyatCharm Jul 19, 2025
bacf923
Merge pull request #105 from MIT-Emerging-Talent/reuploading_data
MyatCharm Jul 19, 2025
91bf31f
Merge pull request #99 from MIT-Emerging-Talent/meeting_minutes
AlhassenSabeeh Jul 19, 2025
eb20f9f
Merge pull request #104 from MIT-Emerging-Talent/modelling_p2p
AlhassenSabeeh Jul 19, 2025
2eb7be5
Merge pull request #121 from MIT-Emerging-Talent/p2p_cleaned
AlhassenSabeeh Jul 19, 2025
7b3a274
restructure exploration folder
AlhassenSabeeh Jul 19, 2025
e87685b
adding new folder
AlhassenSabeeh Jul 19, 2025
243ae1e
Merge pull request #125 from MIT-Emerging-Talent/restructure/3_data_e…
NoorelsalamAlmakki Jul 19, 2025
3cc28a7
restructure / 2_data_preparation
AlhassenSabeeh Jul 19, 2025
0f60e1f
Merge pull request #127 from MIT-Emerging-Talent/restructure/2_data_p…
NoorelsalamAlmakki Jul 19, 2025
30b598f
restructure / 2_data_preparation
AlhassenSabeeh Jul 19, 2025
f304b82
restructure / 1_datasets
AlhassenSabeeh Jul 19, 2025
784ed00
restructure / 1_datasets
AlhassenSabeeh Jul 19, 2025
1e0615f
removing not relevant datasests
AlhassenSabeeh Jul 19, 2025
a4d825c
Merge pull request #129 from MIT-Emerging-Talent/restructure/1_datasets
MadiMalik Jul 19, 2025
14a4f01
Added EDA for the new research question
Jul 19, 2025
344357b
restructer 0_domain_study folder
AlhassenSabeeh Jul 19, 2025
d29d18a
Merge pull request #133 from MIT-Emerging-Talent/restructure/0_domain…
NoorelsalamAlmakki Jul 19, 2025
670720b
ruff fix
AlhassenSabeeh Jul 19, 2025
3d22db8
Merge pull request #131 from MIT-Emerging-Talent/feature/data-explora…
AlhassenSabeeh Jul 19, 2025
0ec684d
Refined 0_domain_study's readmes
Jul 19, 2025
d19cc2b
fixed md errors
Jul 19, 2025
7ee450e
Merge branch 'main' into documentation/domain-study-new-RQ
MyatCharm Jul 20, 2025
570a47c
Merge pull request #134 from MIT-Emerging-Talent/documentation/domain…
MyatCharm Jul 20, 2025
6a41a8d
quick fixing / touch up
MyatCharm Jul 20, 2025
e8edc87
Save model
MyatCharm Jul 20, 2025
20cfe3e
Save Model
MyatCharm Jul 20, 2025
9dd3cbf
Merge pull request #135 from MIT-Emerging-Talent/quick_fix_model_file
dadishimwe Jul 20, 2025
5ea7463
final nalysis script
MadiMalik Jul 20, 2025
812ccba
Merge branch 'main' into modeling
MadiMalik Jul 20, 2025
1107c6c
fixing some checks
dadishimwe Jul 20, 2025
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@ venv/
*.db
*.idea
*.ruff_cache
.DS_Store
107 changes: 106 additions & 1 deletion 0_domain_study/README.md
@@ -1 +1,106 @@
# Domain Research
# Domain Study: Peer-to-Peer (P2P) Lending Risk

This document outlines our foundational research into the domain of Peer-to-Peer
(P2P) lending and the critical challenge of modeling loan default risk. It also
provides an overview of the contents of this folder.

## Folder Contents

This `0_domain_study` folder is organized to document our research process and
findings.

- **`README.md` (This File):** Provides a high-level summary of our research
domain, the evolution of our research question, and the overall folder
structure.
- **`guide.md`:** Offers a detailed orientation to the folder's contents,
explaining the purpose of each file and sub-directory.
- **`research_questions/`:** Contains the history of our research questions.
  - `old/`: Documents our initial research on "Buy Now, Pay Later" (BNPL).
  - `new/`: Details our current, refined research questions on P2P lending.
- **`sources/`:** Lists the data and literature that support our research.
  - `old/`: Contains sources related to our initial BNPL research.
  - `new/`: Contains sources for our current P2P lending research.

## Evolution of Our Research

Our initial research focused on the risks associated with "Buy Now, Pay Later"
(BNPL) services. However, due to the limited availability of public datasets
required to rigorously investigate our initial questions, we pivoted our
research.

Our new focus is on **Peer-to-Peer (P2P) lending**, a domain with rich, publicly
available data that allows for robust modeling and analysis of credit risk.

## The P2P Lending Landscape

Peer-to-Peer (P2P) lending platforms have emerged as a significant alternative
to traditional banking, connecting individual borrowers with investors directly.
These platforms offer greater access to credit for borrowers and potentially
higher returns for investors. However, the decentralized nature of P2P lending
introduces unique challenges in assessing borrower creditworthiness and
predicting the likelihood of default, which is crucial for sustainable growth
and investor confidence.

## Problem Statement

While P2P platforms provide extensive data on loans and borrowers, accurately
predicting which loans will default remains a complex problem. Investors face
the risk of capital loss due to insufficient or ineffective risk assessment
models. Therefore, there is a critical need to develop robust predictive models
that can identify the key drivers of default risk, enabling investors to make
more informed decisions and platforms to refine their underwriting standards.

## Research Question

_What are the key borrower and loan characteristics that best predict default
risk in peer-to-peer (P2P) lending platforms in the United States?_

### Secondary Questions

1. Which machine learning approaches most accurately model default risk in P2P
lending data?
2. How do features such as credit grade, interest rate, debt-to-income ratio,
income, loan term, and loan purpose contribute to risk prediction?
3. How do default risk patterns change across time, different regions, or
borrower segments?
4. In what ways can advanced risk modeling support investor decisions and
improve P2P platform underwriting?

## Key Focus Areas

Our research is structured around three core areas:

- **Technical Focus**: We will concentrate on advanced data cleaning, feature
engineering, and benchmarking machine learning models (Logistic Regression,
Random Forest, XGBoost). We will also use explainability techniques like SHAP
to interpret model predictions.
- **Business Focus**: The insights from our models will be framed to improve
loan pricing, enhance underwriting processes, and develop actionable risk
management tools for investors.
- **User Focus**: We aim to identify risk signals across different borrower
types and investigate factors related to financial health, while ensuring
fairness in risk assessment.

## Methodology and Dataset

Our study takes a quantitative approach, applying machine learning techniques
to a large-scale loan dataset.

### Dataset

The primary dataset for this research is the **Lending Club Loan Data** from
Kaggle, which contains comprehensive information on loans issued in the U.S.

### Modeling Approach

We will employ a systems thinking lens to understand the interconnected factors
influencing default risk. Our modeling process will involve:

- **Data Preparation**: Cleaning and preparing the Lending Club dataset.
- **Model Training**: Building and training predictive models.
- **Evaluation**: Assessing model performance using metrics like AUC-ROC.
- **Interpretation**: Using SHAP to understand the key features driving
predictions.

This structured approach will ensure our findings are both statistically robust
and practically applicable for investors and P2P platforms.
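
As a rough illustration of the evaluation step, the sketch below computes
AUC-ROC directly from its rank-based definition: the probability that a
randomly chosen defaulted loan receives a higher risk score than a randomly
chosen repaid loan. The function name `auc_roc` and the sample labels and
scores are hypothetical and are not taken from the project's code or data. In
practice a library routine would normally be used instead.

```python
def auc_roc(labels, scores):
    """Rank-based AUC-ROC; ties between a default and a non-default count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = defaulted, 0 = repaid; scores are made-up model outputs.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
print(f"AUC-ROC: {auc_roc(labels, scores):.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which
is why the metric suits comparing models that output risk scores rather than
hard labels.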
69 changes: 59 additions & 10 deletions 0_domain_study/guide.md
@@ -1,15 +1,64 @@
# Domain Study: Guide

To do meaningful research in a domain, you need to learn what others already do
and don't understand in this area. Use this folder to organize your group's
understanding of your research domain including: your own summaries, helpful
PDFs, links you found helpful, ...
This `guide.md` file provides an orientation to the contents and structure of
the `0_domain_study` folder. This folder contains our group’s domain-level
understanding of the research question we are pursuing—centered around
financial inclusion, credit systems, and Buy Now, Pay Later (BNPL).

This folder is different from `/notes` because it contains _only_ information
about your research domain. When deciding what goes here, ask yourself this
question: _Would someone need to know this to understand our research?_
Our research process is guided by systems thinking, where we understand each
topic not in isolation but as an interconnected part of the broader financial
and socio-economic system. This has shaped our analysis of how data, credit
scoring models, regulatory structures, user behaviors, and technological
interventions all influence financial outcomes.

## README.md
## Folder Structure and Contents

Use this folder's README to document all the notes and resources in this folder.
Someone shouldn't need to read through _everything_ to find what they need.
### 1. `research_questions/`

This folder documents the evolution of our central inquiry. It contains:

- `initial_questions.md`: A record of early-stage questions and themes we
explored through our literature review.

- `final_questions.md`: The final, actionable research questions that emerged
from both our literature review and group discussions. These are the
questions we are now pursuing using data science methods.

### 2. `sources/`

This folder contains the foundational materials we have studied, including:

- `sources.md`: A list of key academic papers, datasets, and articles that
informed our domain understanding. Each source includes a brief summary
and notes on how it influenced our thinking.

### 3. `README.md`

This file provides an overview of the entire `0_domain_study` folder. It
explains the research domain, links to relevant subfolders (questions,
sources), and gives an at-a-glance summary of our direction.

### 4. `guide.md` (this file)

You’re reading it! This file explains how to navigate the domain study
materials and how each file contributes to understanding our research domain.

## Our Approach

We apply systems thinking to structure our understanding of BNPL and its
intersection with financial inclusion. We focus on identifying:

- **Structural components**: Actors in the system (users, lenders, data
brokers, regulators)

- **Behavioral patterns**: How usage behaviors like BNPL adoption relate to
credit health

- **Purpose dynamics**: Who benefits from the system as currently designed?

Our research is rooted in real-world patterns but also reflects a personal
lens on financial vulnerability, particularly among young or underbanked
populations.

The deliverables from this folder will support Milestone 1 and guide the next
steps of our modeling, data acquisition, and analysis phases.
57 changes: 57 additions & 0 deletions 0_domain_study/research_questions/new/final_questions.md
@@ -0,0 +1,57 @@
# Problem Overview and Final Research Framework

## Primary Research Question

What are the key borrower and loan characteristics that best predict default
risk in peer-to-peer (P2P) lending platforms in the United States?

## Refined Secondary Questions

1. Which machine learning approaches most accurately model default risk in P2P
lending data?
2. How do features such as credit grade, interest rate, debt-to-income ratio,
income, loan term, and loan purpose contribute to risk prediction?
3. How do default risk patterns change across time, different regions, or
borrower segments?
4. In what ways can advanced risk modeling support investor decisions and
improve P2P platform underwriting?

## Research Focus Areas

### Technical Focus

- Advanced data cleaning and feature selection tailored to the Lending Club
dataset
- Model benchmarking using logistic regression, random forest, and
gradient-boosted trees
- Application of SHAP/LIME for model explainability and feature impact analysis
- Temporal and geographic slicing for advanced pattern discovery
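
A minimal sketch of what temporal and geographic slicing could look like:
group loans by a segment key and compare default rates across segments. The
field names (`issue_year`, `state`, `defaulted`) and the sample records are
assumptions made for illustration, not columns confirmed from the dataset.

```python
from collections import defaultdict


def default_rate_by(records, key):
    """Return {segment: share of loans that defaulted} for the given key."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [defaults, total loans]
    for record in records:
        counts[record[key]][0] += record["defaulted"]
        counts[record[key]][1] += 1
    return {seg: d / t for seg, (d, t) in sorted(counts.items())}


# Hypothetical records; "defaulted" is 1 for a default and 0 otherwise.
loans = [
    {"issue_year": 2015, "state": "CA", "defaulted": 1},
    {"issue_year": 2015, "state": "NY", "defaulted": 0},
    {"issue_year": 2016, "state": "CA", "defaulted": 0},
    {"issue_year": 2016, "state": "CA", "defaulted": 0},
    {"issue_year": 2016, "state": "NY", "defaulted": 1},
]

print(default_rate_by(loans, "issue_year"))
print(default_rate_by(loans, "state"))
```

Small segments give noisy rates, so it is worth reporting segment sizes next to
any rate computed this way.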

### Business Focus

- Leveraging model insights to improve loan pricing, selection, and underwriting
processes
- Developing actionable investor tools for portfolio risk management

### User Focus

- Identifying risk signals across borrower types
- Investigating financial health factors and demographic variance in risk
profiles
- Ensuring fairness in risk assessment across borrower populations

## Anticipated Outcomes

- Empirically validated predictors of P2P loan default risk
- Interpretable machine learning models with operational use cases for investors
and platforms
- Recommendations for data-driven, transparent credit risk evaluation
- Guidelines for ongoing research and practical application

## Constraints and Ethics

- Data limitations, such as missing or incomplete records
- Ensuring results are actionable and interpretable for business and regulatory
purposes
- Attention to modeling bias, fairness, privacy, and responsible model
deployment
48 changes: 48 additions & 0 deletions 0_domain_study/research_questions/new/initial_questions.md
@@ -0,0 +1,48 @@
# Initial Research Questions

This document summarizes the initial exploratory research questions that
informed the early direction of our study into credit risk and default
prediction in U.S. peer-to-peer (P2P) lending.

## Main Questions

1. What are the key borrower and loan characteristics that best predict default
risk in P2P lending platforms in the United States?
2. How effectively can machine learning models leverage these variables to
predict loan defaults on P2P platforms?
3. Which modeling techniques and data engineering approaches yield the most
robust and interpretable risk predictions?
4. How do macroeconomic, demographic, and geographic factors influence default
risk patterns among P2P borrowers?

## Supporting Questions

### Technical

- Which data cleaning and feature engineering steps are critical for preparing
P2P loan datasets?
- How do models such as logistic regression, random forest, and gradient
boosting compare for default prediction?
- What role do explainability frameworks (e.g., SHAP, LIME) play in model
transparency?

### Business

- How can P2P platforms enhance loan origination and pricing using predictive
risk analytics?
- What characteristics should investors prioritize when selecting loans for
their portfolios?

### User

- Which borrower demographics and financial behaviors most strongly correlate
with default risk?
- What patterns emerge when examining default outcomes by borrower or loan
attribute?

## Next Steps

- [ ] Obtain and preprocess the Lending Club dataset from Kaggle
- [ ] Perform exploratory data analysis (EDA) to identify predictive patterns
- [ ] Fit baseline and advanced machine learning models for risk prediction
- [ ] Analyze feature importance, refine models, and document findings
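
One possible starting point for the preprocessing step is turning a loan
status field into a binary default target. The sketch below assumes a
`loan_status` column and the status names shown. Both are assumptions that
should be checked against the actual Kaggle file, and the decision to drop
unresolved loans is only one of several reasonable choices.

```python
# Assumed status names; verify against the values present in the dataset.
DEFAULT_STATUSES = {"Charged Off", "Default"}
REPAID_STATUSES = {"Fully Paid"}


def to_target(loan_status):
    """Return 1 for a default, 0 for a repaid loan, None if still unresolved."""
    status = loan_status.strip()
    if status in DEFAULT_STATUSES:
        return 1
    if status in REPAID_STATUSES:
        return 0
    return None


# Hypothetical rows standing in for records read from the CSV.
rows = [
    {"loan_status": "Fully Paid"},
    {"loan_status": "Charged Off"},
    {"loan_status": "Current"},
    {"loan_status": " Default "},
]

# Keep only loans with a known outcome, then attach the binary target.
targets = [to_target(row["loan_status"]) for row in rows]
labeled = [(row, y) for row, y in zip(rows, targets) if y is not None]
print(f"{len(labeled)} of {len(rows)} loans have a resolved outcome")
```

Loans that are still being repaid have no outcome yet, so labeling them as
non-defaults would understate the true default rate.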