# HiTab
## QA
### Pipeline

+ Entity linking for NSM input data.  
**output:** ["processed_input/{tables.jsonl|train_examples.jsonl|dev_samples.jsonl|test_examples.jsonl}"]
```
python qa/datadump/entity_link.py
```
+ Randomly explore consistent programs as a warm-up.  
**output:** ["raw_input/explore/saved_programs.json"]  
```
# weakly supervised
python qa/table/random_explore.py  \
    --output_dir /data/home/hdd3000/USER/HMT/qa/data/raw_input/ \
    --experiment_name explore/ \
    --table_file /data/home/hdd3000/USER/HMT/qa/data/processed_input/tables.jsonl \
    --train_file_tmpl /data/home/hdd3000/USER/HMT/qa/data/processed_input/no_split/train_split_shard_90-{}.jsonl \
    --executor hmt \
    --trigger_word_file /data/home/hdd3000/USER/HMT/qa/data/raw_input/trigger_word_hmtqa.json \
    --n_epoch 300 --save_every_n 10 --id_start 0 --id_end 90  \
    --saved_programs_file saved_programs.json  \
    --allow_union_header 
```
```
# partially supervised
python qa/table/random_explore.py  \
    --output_dir /data/home/hdd3000/USER/HMT/qa/data/raw_input/ \
    --experiment_name explore/ \
    --table_file /data/home/hdd3000/USER/HMT/qa/data/processed_input/tables.jsonl \
    --train_file_tmpl /data/home/hdd3000/USER/HMT/qa/data/processed_input/no_split/train_split_shard_90-{}.jsonl \
    --executor hmt \
    --trigger_word_file /data/home/hdd3000/USER/HMT/qa/data/raw_input/no_trigger.json \
    --n_epoch 300 --save_every_n 10 --id_start 0 --id_end 90  \
    --saved_programs_file saved_programs.json  \
    --allow_union_header  \
    --use_schema_link  \
    --alpha_region 0.2  \
    --alpha_op  0.2  \
    --alpha_schema_link 0.2  \
    --total_reward_threshold 1.4
```
+ Train HiTab-QA using the NSM framework.  
**output:** ["qa/runs/hmtqa/"]
```
./train_hmt.sh
```
+ Test HiTab-QA using the NSM framework.  
**output:** ["qa/runs/hmtqa/"]
```
./test_hmt.sh
```
+ Train HiTab-QA using TAPAS.  
**output:** ["qa/data/raw_input/tapas_data/checkpoints/"]
```
python train_tapas.py
```
+ Test HiTab-QA using TAPAS.  
```
python test_tapas.py
```
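
The two random-exploration commands in the pipeline repeat the same absolute path prefix and most of the same flags. A minimal sketch of how they could be factored, assuming a hypothetical `HMT_ROOT` variable and a hypothetical `run_explore` helper (neither is part of this repo); every flag is copied from the commands above, only the order differs. The helper prints the command for inspection instead of running it.

```shell
# HMT_ROOT is a hypothetical variable; it defaults to the prefix used in the commands above.
HMT_ROOT="${HMT_ROOT:-/data/home/hdd3000/USER/HMT}"
DATA_DIR="$HMT_ROOT/qa/data"

# run_explore is a hypothetical helper: it prepends the flags shared by both
# modes and appends the mode-specific flags passed as "$@".
# Remove the leading `echo` to run the command instead of printing it.
run_explore() {
    echo python qa/table/random_explore.py \
        --output_dir "$DATA_DIR/raw_input/" \
        --experiment_name explore/ \
        --table_file "$DATA_DIR/processed_input/tables.jsonl" \
        --train_file_tmpl "$DATA_DIR/processed_input/no_split/train_split_shard_90-{}.jsonl" \
        --executor hmt \
        --n_epoch 300 --save_every_n 10 --id_start 0 --id_end 90 \
        --saved_programs_file saved_programs.json \
        --allow_union_header \
        "$@"
}

# weakly supervised
run_explore --trigger_word_file "$DATA_DIR/raw_input/trigger_word_hmtqa.json"

# partially supervised
run_explore --trigger_word_file "$DATA_DIR/raw_input/no_trigger.json" \
    --use_schema_link \
    --alpha_region 0.2 --alpha_op 0.2 --alpha_schema_link 0.2 \
    --total_reward_threshold 1.4
```

Setting `HMT_ROOT` before sourcing the snippet (for example `HMT_ROOT=/path/to/HMT`) points both runs at a different checkout without editing each path.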



Change the configurations in `qa/config/` to run different learning algorithms (`['mml', 'sample', 'mapo']`), where `sample` is REINFORCE.
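
For orientation, these three options are usually written as follows in the weakly supervised semantic parsing literature. This is a sketch of the standard objectives, not a description of the code in this repo. The notation is assumed here: $x$ is the question and table, $z$ a program, $R(z)$ its reward, $\pi_\theta$ the model, and $\mathcal{B}$ the set of saved consistent programs.

```math
\begin{aligned}
J_{\text{mml}}(\theta) &= \log \sum_{z \in \mathcal{B}} \pi_\theta(z \mid x) \\
J_{\text{sample}}(\theta) &= \mathbb{E}_{z \sim \pi_\theta(\cdot \mid x)}\big[R(z)\big] \\
J_{\text{mapo}}(\theta) &= \pi_{\mathcal{B}}\, \mathbb{E}_{z \sim \pi_\theta^{+}}\big[R(z)\big] + (1 - \pi_{\mathcal{B}})\, \mathbb{E}_{z \sim \pi_\theta^{-}}\big[R(z)\big]
\end{aligned}
```

Here $\pi_{\mathcal{B}} = \sum_{z \in \mathcal{B}} \pi_\theta(z \mid x)$, and $\pi_\theta^{+}$ and $\pi_\theta^{-}$ are $\pi_\theta$ renormalized inside and outside $\mathcal{B}$. MAPO optimizes the same expected reward as REINFORCE but splits it into a buffer term and a sampled term.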





## Data2Text

See README in `data2text/`.



