Extracting Character Relationships from "Train to Busan" with Python
Network Mining Based on Co-occurrence
Main Points
- Network based on conventional co-occurrence methods
- Capture structured data from unstructured data set
Introduction
Generating a network from co-occurrence was proposed several decades ago, yet it still accounts for most of the papers on network discovery. You can use it to extract a structured network (a graph) from a paragraph of text, online text, or even video.
Prerequisites
Workflow
1. Entity Identification (determine the set of nodes)
The first step is to determine the entity set for a given data set. In a few cases, such as generating a network for a movie like the example above, only a handful of main entities appear, so we can get their names from the web or compile the list ourselves. Other options for identifying entities include:
- Regression method (binary classification)
- SVM (using node features)
- Deep learning algorithm (conventional neural network)
2. Relationship Identification
This project generates the relationships between nodes with the conventional co-occurrence method mentioned above. The code builds an edge between two nodes if they occur in the same paragraph. If an edge between the two nodes already exists, its weight is increased. Once the data set is large enough, the main thread of the data set emerges.
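The paragraph-level co-occurrence rule described above can be sketched as follows. This is a minimal illustration, not the project's actual script: the `build_cooccurrence` helper, the plain substring matching, and the sample sentences are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(paragraphs, names):
    """Count how often each pair of names appears in the same paragraph."""
    weights = Counter()
    for paragraph in paragraphs:
        # Entities present in this paragraph (simple substring match).
        present = sorted(name for name in names if name in paragraph)
        # Each unordered pair of co-occurring entities gains one unit of weight.
        for a, b in combinations(present, 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical sample text, written for this example only.
names = ["Seok-woo", "Su-an", "Sang-hwa"]
paragraphs = [
    "Seok-woo boards the train with Su-an.",
    "Sang-hwa helps Seok-woo hold the door.",
    "Su-an sings while Seok-woo listens.",
]
edges = build_cooccurrence(paragraphs, names)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Sorting the names inside each paragraph keeps every pair in a single canonical order, so `(a, b)` and `(b, a)` are never counted as two different edges.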
Scripts
Note:
The co-occurrence method is only applicable to data sets with obvious centralization; edges with lower weight tend to be redundant. Two methods can be applied to reduce the redundancy:
- The first is to filter out low-weight edges.
- The second is to segment your network.
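Both ideas can be sketched on a weighted edge dictionary. The snippet below is only one possible reading of the note: the `min_weight` threshold is an assumed parameter, and connected components are used as a naive stand-in for segmentation (a real project might prefer a community-detection algorithm instead). The weights are made up for illustration.

```python
from collections import defaultdict


def filter_edges(weights, min_weight):
    """Keep only edges whose weight is at least min_weight."""
    return {pair: w for pair, w in weights.items() if w >= min_weight}


def connected_components(edges):
    """Group nodes into components using the remaining edges."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk from an unvisited node collects one component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical weights for illustration only.
weights = {("A", "B"): 5, ("B", "C"): 1, ("C", "D"): 4}
strong = filter_edges(weights, min_weight=3)
groups = connected_components(strong)
print(strong)
print(groups)
```

Dropping the weak `("B", "C")` edge splits the example graph into two groups, which shows how filtering and segmentation interact.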
Batch Renaming Filenames
Main Points
- **
- **
- **
Introduction
Prerequisites
Workflow
1. **
2. **
3. **
Codes
Reference
map()
string.replace()
zip()
delimiter.join(list_you_want_to_join)
"str1"+"str2"
re.findall(pat,tex)
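The functions listed above can be combined into a small batch-renaming helper. The following is a minimal sketch rather than the author's script: the `new_names` and `rename_all` helpers, the `prefix` parameter, and the "prefix plus zero-padded number" naming scheme are assumptions chosen for illustration.

```python
import os
import re


def new_names(filenames, prefix="file"):
    """Build new names from a prefix, the first number in the old name, and the extension."""
    # map() + str.replace(): normalise spaces to underscores.
    cleaned = list(map(lambda name: name.replace(" ", "_"), filenames))
    result = []
    for name in cleaned:
        stem, ext = os.path.splitext(name)
        # re.findall(): pick up every run of digits in the stem.
        digits = re.findall(r"\d+", stem)
        number = digits[0].zfill(3) if digits else "000"
        # delimiter.join() and "+" concatenation assemble the new name.
        result.append("_".join([prefix, number]) + ext)
    return result


def rename_all(directory, prefix="file"):
    """Rename every file in directory; assumes the new names do not collide."""
    old = sorted(os.listdir(directory))
    # zip() pairs each old name with its replacement.
    for src, dst in zip(old, new_names(old, prefix)):
        os.rename(os.path.join(directory, src), os.path.join(directory, dst))


old = ["report 1.txt", "report 12.txt"]
print(list(zip(old, new_names(old, prefix="report"))))
```

Keeping the name computation in `new_names` separate from `os.rename` makes it possible to preview the mapping before any file is touched.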
Working Log
Main Points
- **
- **
- **
Introduction
Prerequisites
Workflow
1. **
2. **
3. **
Codes
Python Toolboxes
The third-party docx library
Formatted Data

Batch Get Download Link
Python's built-in argparse library lets you run Python programs from the command line.
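A minimal sketch of a command-line interface built with argparse is shown below. The argument names (`source`, `--output`, `--limit`) and their defaults are hypothetical and only illustrate what a link-collecting script might accept; the argument list is passed explicitly so the example runs without a real command line.

```python
import argparse


def build_parser():
    """Create the parser; every argument here is an illustrative assumption."""
    parser = argparse.ArgumentParser(description="Collect download links (illustrative).")
    parser.add_argument("source", help="file or URL to scan")
    parser.add_argument("-o", "--output", default="links.txt", help="where to write the links")
    parser.add_argument("-n", "--limit", type=int, default=10, help="maximum number of links to keep")
    return parser


# Equivalent to running: python script.py page.html --limit 3
args = build_parser().parse_args(["page.html", "--limit", "3"])
print(args.source, args.output, args.limit)
```

In a real script, calling `parse_args()` with no argument makes argparse read the values from `sys.argv` instead.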