Merge pull request #15 from daniel-tran/0.7.0
Version 0.7.0
daniel-tran committed Jan 25, 2023
2 parents 1c54b98 + 352ec5b commit 36605f2
Showing 60 changed files with 1,006 additions and 170 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/run_tests.yaml
@@ -18,7 +18,7 @@ jobs:
# Omit macos-latest due to its high cost to run in GitHub Actions.
# Note that we can also test specific OS versions, e.g. windows-2016
os: [ubuntu-latest, windows-latest]
python-version: ['3.7', '3.8', '3.9']
python-version: ['3.8', '3.9', '3.10']
name: Build for ${{ matrix.os }} with Python ${{ matrix.python-version }}
steps:
- name: Checkout
@@ -59,6 +59,8 @@ jobs:
- name: Run XML unit tests
run: |
cd test
echo "Running Legacy XML file interface unit tests..."
python unit_tests_legacy_xml_file_interface.py
echo "Running XML file interface unit tests..."
python unit_tests_xml_file_interface.py
echo "Running XML Extractor unit tests..."
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,12 @@
# Change Log

## 0.7.0
- Refined xml_file_interface to use a more standard XML document structure that integrates more easily with XSLT
- Added `use_legacy_mode` flag to XML Downloader and Extractor to continue using the original behaviour and assist with transitioning to the updated XML file interface
- Added legacy_xml_file_interface module for backward compatibility with XML files using the previous (deprecated) document structure
- Added download timestamp and library version information to output files
- Dropped library support for Python 3.7

## 0.6.1
- Fixed an issue where the JSON file interface was writing Unicode characters incorrectly to output files

15 changes: 14 additions & 1 deletion CONTRIBUTING.md
@@ -60,6 +60,17 @@ Implementation details specifically geared for performance optimisation, such as
# Development Guide
Some parts of the library have a repeatable process for adding on new features and such, which are documented below:

## Introducing breaking changes
When a change to existing behaviour breaks backward compatibility with the previous release, there are two recommended approaches:

1. Modified behaviour is now the default expectation, so users wanting the old behaviour have to "opt out".
2. Old behaviour is still the default expectation, so users wanting the modified behaviour have to "opt in".

Support for both the old and new behaviour should be maintained for at least one minor release. Afterwards, contributors have two options:

1. The old behaviour can be formally removed in the next major version.
2. The old behaviour is maintained along with the new behaviour into the foreseeable future.
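
As a rough illustration of the first approach (new behaviour by default, with an "opt out" flag), the sketch below uses a hypothetical `ExampleDownloader` class. Only the `use_legacy_mode` flag name comes from this release; the class, its `render` method and the output strings are assumptions made for the example, not the library's implementation.

```python
class ExampleDownloader:
    """Hypothetical class used only to illustrate an opt-out flag."""

    def __init__(self, use_legacy_mode=False):
        # New behaviour is the default; callers opt out by passing True.
        self.use_legacy_mode = use_legacy_mode

    def render(self, book, chapter, passage, text):
        if self.use_legacy_mode:
            # Old behaviour: names are used directly as tag names.
            return (f"<{book}><_{chapter}><_{passage}>{text}"
                    f"</_{passage}></_{chapter}></{book}>")
        # New behaviour: fixed tag names with descriptive attributes.
        return (f'<book name="{book}"><chapter number="{chapter}">'
                f'<passage number="{passage}">{text}</passage>'
                f"</chapter></book>")


new_output = ExampleDownloader().render("Ecclesiastes", 1, 2, "text")
old_output = ExampleDownloader(use_legacy_mode=True).render("Ecclesiastes", 1, 2, "text")
```

Keeping both code paths behind one flag means the old path can later be removed in a major version without changing the default call signature.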

## Supporting a new translation
- Add a new translation code to the appropriate method under `meaningless\utilities\common.py`.
- Add a new test case to `system_tests_bible_translations.py` for the new translation. This is used to validate end-to-end correctness.
@@ -101,7 +112,9 @@ This is the dictionary structure that is passed into `write` and returned from `
},
"Info": {
"Language": "Translation Language",
"Translation": "Translation Code"
"Translation": "Translation Code",
"Timestamp": "Timestamp in ISO 8601 format",
"Meaningless": "Version of Meaningless the file was downloaded from"
}
}
```
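
A minimal sketch of how the two new `Info` entries could be populated is shown below. The `build_info` helper and the hard-coded version string are hypothetical and are not taken from the library; the sketch only shows one way to produce an ISO 8601 timestamp in the same shape as the documented placeholder.

```python
from datetime import datetime, timezone


def build_info(language, translation, version):
    """Hypothetical helper that assembles the "Info" dictionary."""
    # timespec="microseconds" always emits six fractional digits, and a
    # timezone-aware UTC datetime adds the "+00:00" offset suffix.
    timestamp = datetime.now(timezone.utc).isoformat(timespec="microseconds")
    return {
        "Language": language,
        "Translation": translation,
        "Timestamp": timestamp,
        "Meaningless": version,
    }


info = build_info("English", "NIV", "0.7.0")
print(info["Timestamp"])
```

The timestamp can be read back with `datetime.fromisoformat`, which is useful when comparing download times between files.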
156 changes: 129 additions & 27 deletions README.md
@@ -99,15 +99,17 @@ if __name__ == '__main__':
```
Output:

Running the above code would produce a file called `Ecclesiastes.yaml` in the current working directory with the following contents:
```
Running the above code would produce a file called `Ecclesiastes.yaml` in the current working directory with the approximate contents:
```yaml
Ecclesiastes:
1:
2: "² “Meaningless! Meaningless!”\n says the Teacher.\n“Utterly meaningless!\n\
\ Everything is meaningless.”"
Info:
Language: English
Translation: NIV
Timestamp: '0000-00-00T00:00:00.000000+00:00'
Meaningless: 0.0.0
```
## YAML Extractor
@@ -144,15 +146,17 @@ if __name__ == '__main__':
```
Output:

Running the above code would produce a file called `Ecclesiastes.yaml` in the current working directory with the following contents:
```python
Running the above code would produce a file called `Ecclesiastes.yaml` in the current working directory with the approximate contents:
```yaml
Ecclesiastes:
1:
2: "² “Meaningless! Meaningless!”\n says the Teacher.\n“Utterly meaningless!\n\
\ Everything is meaningless.”"
Info:
Language: English
Translation: NIV
Timestamp: '0000-00-00T00:00:00.000000+00:00'
Meaningless: 0.0.0
Customised?: true
```
@@ -167,8 +171,8 @@ if __name__ == '__main__':
```
Output:
Running the above code would produce a file called `Ecclesiastes.json` in the current working directory with the following contents:
```
Running the above code would produce a file called `Ecclesiastes.json` in the current working directory with the approximate contents:
```json
{
"Ecclesiastes": {
"1": {
@@ -177,6 +181,8 @@ Running the above code would produce a file called `Ecclesiastes.json` in the cu
},
"Info": {
"Language": "English",
"Meaningless": "0.0.0",
"Timestamp": "0000-00-00T00:00:00.000000+00:00",
"Translation": "NIV"
}
}
@@ -216,8 +222,8 @@ if __name__ == '__main__':
```
Output:

Running the above code would produce a file called `Ecclesiastes.json` in the current working directory with the following contents:
```python
Running the above code would produce a file called `Ecclesiastes.json` in the current working directory with the approximate contents:
```json
{
"Ecclesiastes": {
"1": {
@@ -227,6 +233,8 @@ Running the above code would produce a file called `Ecclesiastes.json` in the cu
"Info": {
"Customised?": true,
"Language": "English",
"Meaningless": "0.0.0",
"Timestamp": "0000-00-00T00:00:00.000000+00:00",
"Translation": "NIV"
}
}
@@ -243,13 +251,105 @@ if __name__ == '__main__':
```
Output:

Running the above code would produce a file called `Ecclesiastes.xml` in the current working directory with the following contents:
Running the above code would produce a file called `Ecclesiastes.xml` in the current working directory with the approximate contents:
```xml
<?xml version="1.0" encoding="utf-8"?>
<root>
<info>
<language>English</language>
<translation>NIV</translation>
<timestamp>0000-00-00T00:00:00.000000+00:00</timestamp>
<meaningless>0.0.0</meaningless>
</info>
<book name="Ecclesiastes" tag="_Ecclesiastes">
<chapter number="1" tag="_1">
<passage number="2" tag="_2">² “Meaningless! Meaningless!”
says the Teacher.
“Utterly meaningless!
Everything is meaningless.”</passage>
</chapter>
</book>
</root>
```
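
Because the new structure uses fixed tag names with `name` and `number` attributes, a passage can be located with a simple path expression. The sketch below uses only the standard library's `xml.etree.ElementTree` on an abbreviated sample document; the `find_passage` helper is hypothetical and is not how the XML Extractor is implemented.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample following the document structure shown above.
SAMPLE = """<root>
  <info>
    <language>English</language>
    <translation>NIV</translation>
  </info>
  <book name="Ecclesiastes" tag="_Ecclesiastes">
    <chapter number="1" tag="_1">
      <passage number="2" tag="_2">Sample passage text</passage>
    </chapter>
  </book>
</root>"""


def find_passage(xml_text, book, chapter, passage):
    """Hypothetical helper: return the passage text, or None if absent."""
    root = ET.fromstring(xml_text)
    path = (f"./book[@name='{book}']"
            f"/chapter[@number='{chapter}']"
            f"/passage[@number='{passage}']")
    node = root.find(path)
    return None if node is None else node.text


print(find_passage(SAMPLE, "Ecclesiastes", 1, 2))
```

The same attribute-based selection is what makes the structure convenient to target from XSLT templates.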

## XML Extractor
Much like the YAML Extractor, the XML Extractor uses the generated files from the XML Downloader to find passages.
```python
from meaningless import XMLExtractor
if __name__ == '__main__':
bible = XMLExtractor()
passage = bible.get_passage('Ecclesiastes', 1, 2)
print(passage)
```
Output:

Assuming the XML downloader has already generated an XML file in the current directory called `Ecclesiastes.xml` which contains the book of Ecclesiastes in XML format:
```
² “Meaningless! Meaningless!”
says the Teacher.
“Utterly meaningless!
Everything is meaningless.”
```

## XML File Interface
The XML File Interface is a set of helper methods used to read and write XML files. Unlike the other file interfaces, this is more geared towards the specific document format used by the XML Downloader and Extractor, so you may observe some strange behaviour if you try using this for general purpose XML file interactions.
```python
from meaningless import XMLDownloader, xml_file_interface
if __name__ == '__main__':
downloader = XMLDownloader()
downloader.download_passage('Ecclesiastes', 1, 2)
bible = xml_file_interface.read('./Ecclesiastes.xml')
bible['Info']['Customised'] = True
xml_file_interface.write('./Ecclesiastes.xml', bible)
```
Output:

Running the above code would produce a file called `Ecclesiastes.xml` in the current working directory with the approximate contents:
```xml
<?xml version="1.0" encoding="utf-8"?>
<root>
<info>
<language>English</language>
<translation>NIV</translation>
<timestamp>0000-00-00T00:00:00.000000+00:00</timestamp>
<meaningless>0.0.0</meaningless>
<customised>true</customised>
</info>
<book name="Ecclesiastes" tag="_Ecclesiastes">
<chapter number="1" tag="_1">
<passage number="2" tag="_2">² “Meaningless! Meaningless!”
says the Teacher.
“Utterly meaningless!
Everything is meaningless.”</passage>
</chapter>
</book>
</root>
```

**Note that you are allowed to write badly formed XML documents using this file interface, but they will cause runtime errors in your code upon trying to read and process them.**

## Legacy XML Downloader
The Legacy XML Downloader is effectively the same as the XML Downloader prior to version 0.7.0.
```python
from meaningless import XMLDownloader
if __name__ == '__main__':
downloader = XMLDownloader(use_legacy_mode=True)
downloader.download_passage('Ecclesiastes', 1, 2)
```
Output:

Running the above code would produce a file called `Ecclesiastes.xml` in the current working directory with the approximate contents:
```xml
<?xml version="1.0" encoding="utf-8"?>
<root>
<Info>
<Language>English</Language>
<Translation>NIV</Translation>
<Timestamp>0000-00-00T00:00:00.000000+00:00</Timestamp>
<Meaningless>0.0.0</Meaningless>
</Info>
<Ecclesiastes>
<_1>
@@ -268,47 +368,49 @@ Note that the following adjustments are made to the downloaded contents to ensur
2. All tag names starting with a number are prefixed.
3. Tags corresponding to book names use a placeholder character for spaces.
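
The two adjustments listed above could be sketched as follows. This is an assumption-laden illustration: the `to_legacy_tag` helper is hypothetical, the `_` prefix is inferred from the `<_1>` tags in the sample output, and the `-` placeholder for spaces is a made-up value, since the actual placeholder character is not shown here.

```python
def to_legacy_tag(name, prefix="_", space_placeholder="-"):
    """Hypothetical helper that turns a name into a valid XML tag name.

    The space placeholder default is an assumption for illustration only.
    """
    # XML tag names cannot contain spaces, so substitute a placeholder.
    tag = str(name).replace(" ", space_placeholder)
    # XML tag names cannot start with a digit, so add a prefix.
    if tag[:1].isdigit():
        tag = prefix + tag
    return tag


chapter_tag = to_legacy_tag(1)
book_tag = to_legacy_tag("Song of Songs")
numbered_book_tag = to_legacy_tag("1 John")
```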

## XML Extractor
Much like the YAML Extractor, the XML Extractor uses the generated files from the XML Downloader to find passages.
## Legacy XML Extractor
The Legacy XML Extractor is effectively the same as the XML Extractor prior to version 0.7.0, and as such, only supports processing of XML files from versions prior to 0.7.0 or produced by the Legacy XML File Interface.
```python
from meaningless import XMLExtractor
if __name__ == '__main__':
bible = XMLExtractor()
bible = XMLExtractor(use_legacy_mode=True)
passage = bible.get_passage('Ecclesiastes', 1, 2)
print(passage)
```
Output:

Assuming the XML downloader has already generated a XML file in the current directory called `Ecclesiastes.xml` which contains the book of Ecclesiastes in XML format:
Assuming the Legacy XML Downloader has already generated an XML file in the current directory called `Ecclesiastes.xml` which contains the book of Ecclesiastes in XML format:
```
² “Meaningless! Meaningless!”
says the Teacher.
“Utterly meaningless!
Everything is meaningless.”
```

## XML File Interface
The XML File Interface is a set of helper methods used to read and write XML files. Unlike the other file interfaces, this is more geared towards the XML document format used by the XML Downloader and Extractor, so you may observe some strange behaviour if you try using this for general purpose XML file interactions.
## Legacy XML File Interface
The Legacy XML File Interface is a set of helper methods used to read and write XML files using the document structure prior to version 0.7.0. You may observe some strange behaviour if you try using this for general purpose XML file interactions, so it is only recommended for use with files produced by the Legacy XML Downloader.
```python
from meaningless import XMLDownloader, xml_file_interface
from meaningless import XMLDownloader, legacy_xml_file_interface
if __name__ == '__main__':
downloader = XMLDownloader()
downloader = XMLDownloader(use_legacy_mode=True)
downloader.download_passage('Ecclesiastes', 1, 2)
bible = xml_file_interface.read('./Ecclesiastes.xml')
bible = legacy_xml_file_interface.read('./Ecclesiastes.xml')
bible['Info']['Customised'] = True
xml_file_interface.write('./Ecclesiastes.xml', bible)
legacy_xml_file_interface.write('./Ecclesiastes.xml', bible)
```
Output:

Running the above code would produce a file called `Ecclesiastes.xml` in the current working directory with the following contents:
Running the above code would produce a file called `Ecclesiastes.xml` in the current working directory with the approximate contents:
```xml
<?xml version="1.0" encoding="utf-8"?>
<root>
<Info>
<Language>English</Language>
<Translation>NIV</Translation>
<Timestamp>0000-00-00T00:00:00.000000+00:00</Timestamp>
<Meaningless>0.0.0</Meaningless>
<Customised>true</Customised>
</Info>
<Ecclesiastes>
@@ -335,13 +437,13 @@ if __name__ == '__main__':
```
Output:

Running the above code would produce a file called `Ecclesiastes.csv` in the current working directory with the following contents:
Running the above code would produce a file called `Ecclesiastes.csv` in the current working directory with the approximate contents:
```
Book,Chapter,Passage,Text,Language,Translation
Book,Chapter,Passage,Text,Language,Translation,Timestamp,Meaningless
Ecclesiastes,1,2,"² “Meaningless! Meaningless!”
says the Teacher.
“Utterly meaningless!
Everything is meaningless.”",English,NIV
Everything is meaningless.”",English,NIV,0000-00-00T00:00:00.000000+00:00,0.0.0
```
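
For readers consuming these files outside the library, the two new columns can be read like any other CSV field. The sketch below uses the standard library's `csv` module on an in-memory sample with placeholder text and made-up timestamp and version values; it does not use the CSV Extractor or the CSV File Interface.

```python
import csv
import io

# In-memory sample with the same header row as the output shown above.
SAMPLE_CSV = (
    "Book,Chapter,Passage,Text,Language,Translation,Timestamp,Meaningless\n"
    'Ecclesiastes,1,2,"line one\nline two",English,NIV,'
    "2023-01-25T00:00:00.000000+00:00,0.7.0\n"
)

# newline="" leaves line endings untouched so the csv module can handle
# the line break embedded in the quoted Text field.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV, newline="")))

for row in rows:
    print(row["Book"], row["Chapter"], row["Passage"], row["Timestamp"], row["Meaningless"])
```

Files written before 0.7.0 lack the `Timestamp` and `Meaningless` columns, so code that reads both old and new files should use `row.get(...)` rather than direct indexing.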

## CSV Extractor
@@ -365,7 +467,7 @@ Assuming the CSV downloader has already generated a CSV file in the current dire
```

## CSV File Interface
The CSV File Interface is a set of helper methods used to read and write CSV files. Like the XML File Interface, this is geared towards the CSV document format used by the CSV Downloader and Extractor and cannot be used to add custom attributes to the output file when writing CSV data.
The CSV File Interface is a set of helper methods used to read and write CSV files. This is geared towards the CSV document format used by the CSV Downloader and Extractor and cannot be used to add custom attributes to the output file when writing CSV data.
```python
from meaningless import CSVDownloader, csv_file_interface
@@ -378,13 +480,13 @@ if __name__ == '__main__':
```
Output:

Running the above code would produce a file called `Ecclesiastes.csv` in the current working directory with the following contents:
Running the above code would produce a file called `Ecclesiastes.csv` in the current working directory with the approximate contents:
```
Book,Chapter,Passage,Text,Language,Translation
Book,Chapter,Passage,Text,Language,Translation,Timestamp,Meaningless
Ecclesiastes,1,2,"² “Meaningless! Meaningless!”
says the Teacher.
“Utterly meaningless!
Everything is meaningless.”",English (EN),NIV
Everything is meaningless.”",English (EN),NIV,0000-00-00T00:00:00.000000+00:00,0.0.0
```

## Text searching within files
1 change: 0 additions & 1 deletion VERSION.txt

This file was deleted.

2 changes: 1 addition & 1 deletion docs/_static/documentation_options.js
@@ -1,6 +1,6 @@
var DOCUMENTATION_OPTIONS = {
URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
VERSION: '0.6.1',
VERSION: '0.7.0',
LANGUAGE: 'None',
COLLAPSE_INDEX: false,
BUILDER: 'html',