Compare commits

..

11 Commits

Author SHA1 Message Date
Sergio Martínez Portela
93e2001400 Merge remote-tracking branch 'origin/develop' into support-updating-raw-note-contents
All checks were successful
Testing / pytest (push) Successful in 24s
Testing / mypy (push) Successful in 34s
Testing / style-formatting (push) Successful in 28s
Testing / style-sorted-imports (push) Successful in 27s
Testing / stability-extra-test (push) Successful in 28s
2024-10-07 23:26:50 +02:00
Sergio Martínez Portela
15af4212ae Apply formatting scripts.
All checks were successful
Testing / pytest (push) Successful in 27s
Testing / mypy (push) Successful in 29s
Testing / style-formatting (push) Successful in 29s
Testing / style-sorted-imports (push) Successful in 23s
Testing / stability-extra-test (push) Successful in 28s
2024-10-07 19:48:16 +02:00
Sergio Martínez Portela
56416f2fd8 Merge remote-tracking branch 'origin/develop' into support-updating-raw-note-contents
Some checks failed
Testing / mypy (push) Waiting to run
Testing / style-formatting (push) Waiting to run
Testing / style-sorted-imports (push) Waiting to run
Testing / stability-extra-test (push) Waiting to run
Testing / pytest (push) Has been cancelled
2024-10-07 19:47:23 +02:00
Sergio Martínez Portela
97db5686d8 Tag structural elements.
All checks were successful
Testing / pytest (push) Successful in 25s
Testing / mypy (push) Successful in 1m3s
Testing / stability-extra-test (push) Successful in 1m2s
2024-07-30 17:43:46 +02:00
Sergio Martínez Portela
c7f78e0a6c Add reparse test line ordering of reparsing.
Some checks failed
Testing / pytest (push) Failing after 27s
Testing / mypy (push) Successful in 1m6s
Testing / stability-extra-test (push) Successful in 1m6s
2024-07-30 17:15:29 +02:00
Sergio Martínez Portela
ef615feac5 Allow trailing whitespace changes on raw content update.
All checks were successful
Testing / pytest (push) Successful in 28s
Testing / mypy (push) Successful in 31s
Testing / stability-extra-test (push) Successful in 31s
2024-07-30 10:52:37 +02:00
Sergio Martínez Portela
d0498d2f5b Add support for data parse on headline content update. 2024-07-30 10:52:29 +02:00
Sergio Martínez Portela
5de339570d Add (failing) test for reparse on headline update. 2024-07-30 10:52:29 +02:00
Sergio Martínez Portela
071004ea7a Quick fix for handling date data.
All checks were successful
Testing / pytest (push) Successful in 24s
Testing / mypy (push) Successful in 33s
Testing / stability-extra-test (push) Successful in 24s
2024-07-29 15:36:57 +01:00
Sergio Martínez Portela
f5cafa9c62 Merge remote-tracking branch 'origin/develop' into support-updating-raw-note-contents 2024-07-29 15:35:20 +01:00
Sergio Martínez Portela
134d872ca6 wip: First approach to get going.
Some checks failed
Testing / pytest (push) Successful in 27s
Testing / mypy (push) Failing after 32s
Testing / stability-extra-test (push) Successful in 22s
2024-07-22 23:31:19 +01:00
9 changed files with 186 additions and 73 deletions

View File

@ -9,8 +9,8 @@ jobs:
- name: Check out repository code
uses: actions/checkout@v3
- run: apt-get update && apt-get install -y python3-pip
- run: pip install --break-system-package -e .
- run: pip install --break-system-package pytest
- run: pip install -e .
- run: pip install pytest
- run: pytest
mypy:
@ -19,8 +19,8 @@ jobs:
- name: Check out repository code
uses: actions/checkout@v3
- run: apt-get update && apt-get install -y python3-pip
- run: pip install --break-system-package -e .
- run: pip install --break-system-package mypy
- run: pip install -e .
- run: pip install mypy
- run: mypy org_rw --check-untyped-defs
style-formatting:
@ -29,8 +29,8 @@ jobs:
- name: Check out repository code
uses: actions/checkout@v3
- run: apt-get update && apt-get install -y python3-pip
- run: pip install --break-system-package -e .
- run: pip install --break-system-package black
- run: pip install -e .
- run: pip install black
- run: black --check .
style-sorted-imports:
@ -39,8 +39,8 @@ jobs:
- name: Check out repository code
uses: actions/checkout@v3
- run: apt-get update && apt-get install -y python3-pip
- run: pip install --break-system-package -e .
- run: pip install --break-system-package isort
- run: pip install -e .
- run: pip install isort
- run: isort --profile black --check .
stability-extra-test:
@ -49,5 +49,5 @@ jobs:
- name: Check out repository code
uses: actions/checkout@v3
- run: apt-get update && apt-get install -y git-core python3-pip
- run: pip install --break-system-package -e .
- run: pip install -e .
- run: bash extra-tests/check_all.sh

View File

@ -7,12 +7,6 @@ A python library to parse, modify and save Org-mode files.
- Modify these data and write it back to disk.
- Keep the original structure intact (indentation, spaces, format, ...).
** Principles
- Avoid any dependency outside of Python's standard library.
- Don't do anything outside of the scope of parsing/re-serializing Org-mode files.
- *Modification of the original text if there's no change is considered a bug (see [[id:7363ba38-1662-4d3c-9e83-0999824975b7][Known issues]]).*
- Data structures should be exposed as it's read on Emacs's org-mode or when in doubt as raw as possible.
- Data in the objects should be modificable as a way to update the document itself. *Consider this a Object-oriented design.*
** Safety mechanism
As this library is still in early development. Running it over files might
produce unexpected changes on them. For this reason it's heavily recommended to
@ -27,9 +21,6 @@ Also, see [[id:76e77f7f-c9e0-4c83-ad2f-39a5a8894a83][Known issues:Structure modi
not properly stored and can trigger this safety mechanism on a false-positive.
* Known issues
:PROPERTIES:
:ID: 7363ba38-1662-4d3c-9e83-0999824975b7
:END:
** Structure modifications
:PROPERTIES:
:ID: 76e77f7f-c9e0-4c83-ad2f-39a5a8894a83

View File

@ -24,14 +24,6 @@ class ResultsDrawerNode(DrawerNode):
return "<Results: {}>".format(len(self.children))
class GenericDrawerNode(DrawerNode):
def __init__(self, drawer_name):
self.drawer_name = drawer_name
def __repr__(self):
return "<Drawer{}: {}>".format(self.drawer_name, len(self.children))
class PropertyNode:
def __init__(self, key, value):
self.key = key
@ -70,18 +62,12 @@ class TableSeparatorRow:
def __init__(self, orig=None):
self.orig = orig
def get_raw(self):
return get_raw_contents(self.orig)
class TableRow:
def __init__(self, cells, orig=None):
self.cells = cells
self.orig = orig
def get_raw(self):
return get_raw_contents(self.orig)
class Text:
def __init__(self, content):

View File

@ -122,7 +122,6 @@ NON_FINISHED_GROUPS = (
dom.ListGroupNode,
dom.ResultsDrawerNode,
dom.PropertyDrawerNode,
dom.GenericDrawerNode,
)
FREE_GROUPS = (dom.CodeBlock,)
@ -415,7 +414,6 @@ class Headline:
if (
isinstance(line, DelimiterLine)
and line.delimiter_type == DelimiterLineType.END_BLOCK
and line.type_data.subtype == current_node.header.type_data.subtype
):
start = current_node.header.linenum
@ -638,13 +636,6 @@ class Headline:
assert current_node is None
current_node = dom.ResultsDrawerNode()
# TODO: Allow indentation of these blocks inside others
indentation_tree = [current_node]
tree.append(current_node)
elif content.strip().startswith(":") and content.strip().endswith(":"):
assert current_node is None
current_node = dom.GenericDrawerNode(content.strip().strip(":"))
# TODO: Allow indentation of these blocks inside others
indentation_tree = [current_node]
tree.append(current_node)
@ -873,24 +864,9 @@ class Headline:
yield from get_links_from_content(item.content)
def get_lines_between(self, start, end):
# @TODO: Generalize for other line types too.
everything = (
[]
# + self.keywords
+ self.contents
# + self.list_items
# + self.table_rows
# + self.properties
# + self.structural
+ self.delimiters
)
for line in everything:
for line in self.contents:
if start <= line.linenum < end:
if "get_raw" in dir(line):
yield "".join(line.get_raw())
else:
yield line.line
yield "".join(line.get_raw())
def get_contents(self, format):
if format == "raw":
@ -901,6 +877,46 @@ class Headline:
else:
raise NotImplementedError()
def update_raw_contents(self, new_contents):
# Clear elements
self.keywords = []
self.contents = []
self.list_items = []
self.table_rows = []
self.properties = []
self.structural = []
self.delimiters = []
self.scheduled = None
self.deadline = None
self.closed = None
reader = OrgDocReader(environment=self.doc.environment)
reader.read(new_contents)
# No need to finalize as we can take the data from the reader instead of from a doc
if len(reader.headlines) > 0:
# Probably can be done by just adding the headlines to this one's children
raise NotImplementedError(
"new headlines on raw contents not supported yet. This probably should be simple, see comment on code."
)
for kw in reader.keywords:
self.keywords.append(offset_linenum(self.start_line + 1, kw))
for content in reader.contents:
self.contents.append(offset_linenum(self.start_line + 1, content))
for list_item in reader.list_items:
self.list_items.append(offset_linenum(self.start_line + 1, list_item))
for struct_item in reader.structural:
self.structural.append(offset_linenum(self.start_line + 1, struct_item))
for prop in reader.properties:
self.properties.append(offset_linenum(self.start_line + 1, prop))
# Environment is not used, as it's known
def get_element_in_line(self, linenum):
for line in self.contents:
if linenum == line.linenum:
@ -1098,6 +1114,7 @@ Keyword = collections.namedtuple(
Property = collections.namedtuple(
"Property", ("linenum", "match", "key", "value", "options")
)
Structural = collections.namedtuple("Structural", ("linenum", "line"))
class ListItem:
@ -1146,6 +1163,19 @@ TableRow = collections.namedtuple(
),
)
ItemWithLineNum = Union[Keyword, RawLine, Property, ListItem, Structural]
def offset_linenum(offset: int, item: ItemWithLineNum) -> ItemWithLineNum:
if isinstance(item, ListItem):
item.linenum += offset
return item
assert isinstance(
item, (Keyword, RawLine, Property, Structural)
), "Expected (Keyword|RawLine|Property|Structural), found {}".format(item)
return item._replace(linenum=item.linenum + offset)
# @TODO How are [YYYY-MM-DD HH:mm--HH:mm] and ([... HH:mm]--[... HH:mm]) differentiated ?
# @TODO Consider recurrence annotations
@ -2331,7 +2361,6 @@ class OrgDoc:
list_items,
structural,
properties,
delimiters,
environment=BASE_ENVIRONMENT,
):
self.todo_keywords = [HeadlineState(name=kw) for kw in DEFAULT_TODO_KEYWORDS]
@ -2361,11 +2390,11 @@ class OrgDoc:
self.list_items: List[ListItem] = list_items
self.structural: List = structural
self.properties: List = properties
self.delimiters: List = delimiters
self._path = None
self.headlines: List[Headline] = list(
map(lambda hl: parse_headline(hl, self, self), headlines)
)
self.environment = environment
@property
def id(self):
@ -2526,9 +2555,6 @@ class OrgDoc:
for struct in self.structural:
lines.append(dump_structural(struct))
for content in self.delimiters:
lines.append(dump_delimiters(content))
for kw in self.keywords:
lines.append(dump_kw(kw))
@ -2553,8 +2579,8 @@ class OrgDocReader:
self.delimiters: List[DelimiterLine] = []
self.list_items: List[ListItem] = []
self.table_rows: List[TableRow] = []
self.structural: List = []
self.properties: List = []
self.structural: List[Structural] = []
self.properties: List[Property] = []
self.current_drawer: Optional[List] = None
self.environment = environment
@ -2566,7 +2592,6 @@ class OrgDocReader:
self.list_items,
self.structural,
self.properties,
self.delimiters,
self.environment,
)
@ -2737,7 +2762,7 @@ class OrgDocReader:
def add_property_drawer_line(self, linenum: int, line: str, match: re.Match):
if len(self.headline_hierarchy) == 0:
self.current_drawer = self.properties
self.structural.append((linenum, line))
self.structural.append(Structural(linenum, line))
else:
assert self.headline_hierarchy[-1] is not None
self.current_drawer = self.headline_hierarchy[-1]["properties"]
@ -2756,7 +2781,7 @@ class OrgDocReader:
def add_drawer_end_line(self, linenum: int, line: str, match: re.Match):
self.current_drawer = None
if len(self.headline_hierarchy) == 0:
self.structural.append((linenum, line))
self.structural.append(Structural(linenum, line))
else:
assert self.headline_hierarchy[-1] is not None
self.headline_hierarchy[-1]["structural"].append((linenum, line))

View File

@ -9,7 +9,6 @@ from .org_rw import (
ListItem,
RawLine,
Strike,
TableRow,
Text,
Underlined,
Verbatim,
@ -51,8 +50,6 @@ def get_raw_contents(doc) -> str:
return doc.get_raw()
if isinstance(doc, ListItem):
return dump_contents(doc)[1]
if isinstance(doc, TableRow):
return dump_contents(doc)[1]
print("Unhandled type: " + str(doc))
raise NotImplementedError("Unhandled type: " + str(doc))

1
requirements.txt Normal file
View File

@ -0,0 +1 @@
# No external requirements at this point

View File

@ -0,0 +1,22 @@
#+TITLE: 13-Update reparse
#+DESCRIPTION: Update-Reparse org file
#+TODO: TODO(t) PAUSED(p) | DONE(d)
* First level
:PROPERTIES:
:ID: 13-update-reparse-first-level-id
:CREATED: [2020-01-01 Wed 01:01]
:END:
First level content
- A list of items ::
- With a sublist
Something after the list.
** Second level
:PROPERTIES:
:ID: 13-update-reparse-second-level-id
:END:
Second level content

View File

@ -1,4 +1,5 @@
import os
import tempfile
import unittest
from datetime import datetime as DT
@ -869,6 +870,86 @@ class TestSerde(unittest.TestCase):
self.assertEqual(dumps(doc), orig)
def test_update_reparse_same_structure(self):
with open(os.path.join(DIR, "01-simple.org")) as f:
doc = load(f)
hl = doc.getTopHeadlines()[0]
ex = HL(
"First level",
props=[
("ID", "01-simple-first-level-id"),
("CREATED", DT(2020, 1, 1, 1, 1)),
],
content=" First level content\n",
children=[
HL(
"Second level",
props=[("ID", "01-simple-second-level-id")],
content="\n Second level content\n",
children=[
HL(
"Third level",
props=[("ID", "01-simple-third-level-id")],
content="\n Third level content\n",
)
],
)
],
)
# Ground check
ex.assert_matches(self, hl)
# Update
lines = list(doc.dump_headline(hl, recursive=False))
assert lines[0].startswith("* ") # Title, skip it
content = "\n".join(lines[1:])
hl.update_raw_contents(content)
# Check after update
ex.assert_matches(self, hl, accept_trailing_whitespace_changes=True)
def test_update_reparse_same_values(self):
with open(os.path.join(DIR, "13-update-reparse-test.org")) as f:
doc = load(f)
expected_hl_contents = """ :PROPERTIES:
:ID: 13-update-reparse-first-level-id
:CREATED: [2020-01-01 Wed 01:01]
:END:
First level content
- A list of items ::
- With a sublist
Something after the list.
"""
hl = doc.getTopHeadlines()[0]
lines = list(doc.dump_headline(hl, recursive=False))
assert lines[0].startswith("* ") # Title, skip it
content = "\n".join(lines[1:])
self.assertEqual(content, expected_hl_contents)
# Check after update
hl.update_raw_contents(content)
self.assertEqual(content, expected_hl_contents)
# Check after dump and reload
with tempfile.NamedTemporaryFile("wt") as f:
save = org_rw.dumps(doc)
f.write(save)
f.flush()
with open(f.name, "rt") as reader:
reloaded = org_rw.load(reader)
re_hl = reloaded.getTopHeadlines()[0]
lines = list(doc.dump_headline(hl, recursive=False))
assert lines[0].startswith("* ") # Title, skip it
content = "\n".join(lines[1:])
self.assertEqual(content, expected_hl_contents)
def test_mimic_write_file_13(self):
with open(os.path.join(DIR, "13-tags.org")) as f:
orig = f.read()

View File

@ -67,7 +67,12 @@ class HL:
self.content = content
self.children = children
def assert_matches(self, test_case: unittest.TestCase, doc):
def assert_matches(
self,
test_case: unittest.TestCase,
doc,
accept_trailing_whitespace_changes=False,
):
test_case.assertEqual(self.title, get_raw(doc.title))
# Check properties
@ -84,7 +89,12 @@ class HL:
timestamp_to_datetime(doc_props[i].value), prop[1]
)
test_case.assertEqual(get_raw_contents(doc), self.get_raw())
if accept_trailing_whitespace_changes:
test_case.assertEqual(
get_raw_contents(doc).rstrip(), self.get_raw().rstrip()
)
else:
test_case.assertEqual(get_raw_contents(doc), self.get_raw())
# Check children
if self.children is None: