ocrd-kraken-segment creates negative coordinates (=invalid PAGE) #34

Open
stefanCCS opened this issue May 9, 2022 · 2 comments

@stefanCCS

Hi,

I have an example where ocrd-kraken-segment creates negative coordinates, which makes the PAGE output invalid.
All I ran was:

ocrd resmgr download ocrd-kraken-segment blla.mlmodel
ocrd-kraken-segment -I <inputFileGrp> -O <outputFileGrp>

Example data: example.zip
In the result I see:

<pc:TextRegion id="region_line_36">
            <pc:Coords points="3040,382 3040,-2 3219,-2 3219,382 3216,575 3037,569"/>
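
A quick way to spot such regions is to parse the `points` attribute and look for negative values. The following is only a minimal sketch: the helper names `parse_points` and `negative_points` are made up for illustration and are not part of ocrd_kraken or OCR-D core.

```python
def parse_points(points: str):
    """Parse a PAGE-XML `points` attribute into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def negative_points(points: str):
    """Return the coordinate pairs that have a negative x or y (hypothetical helper)."""
    return [(x, y) for x, y in parse_points(points) if x < 0 or y < 0]


# The Coords value reported above
points = "3040,382 3040,-2 3219,-2 3219,382 3216,575 3037,569"
print(negative_points(points))  # [(3040, -2), (3219, -2)]
```

Here the two points with y = -2 are the ones that lie outside the page.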

kba added the bug label on May 9, 2022
kba self-assigned this on May 9, 2022
bertsky (Collaborator) commented May 25, 2023

It no longer behaves this way. Since #33, we clip all resulting polygons to the Border/canvas.
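
To illustrate what clipping to the canvas means for the polygon from the report, here is a pure-Python sketch using Sutherland-Hodgman clipping against the page rectangle. This is not necessarily how #33 does it; the function name `clip_to_canvas` and the page size of 4000 x 5000 are assumptions made up for the example.

```python
def clip_to_canvas(polygon, width, height):
    """Clip a polygon (list of (x, y)) to the rectangle [0, width] x [0, height].

    Sutherland-Hodgman clipping; illustrative only. For concave input it can
    produce degenerate edges running along the canvas boundary.
    """
    def clip_edge(points, inside, intersect):
        result = []
        for i, cur in enumerate(points):
            prev = points[i - 1]  # wraps around to the last point for i == 0
            if inside(cur):
                if not inside(prev):
                    result.append(intersect(prev, cur))  # entering the canvas
                result.append(cur)
            elif inside(prev):
                result.append(intersect(prev, cur))  # leaving the canvas
        return result

    def cross_x(x0):
        # Intersection of segment p-q with the vertical line x = x0
        return lambda p, q: (x0, p[1] + (x0 - p[0]) / (q[0] - p[0]) * (q[1] - p[1]))

    def cross_y(y0):
        # Intersection of segment p-q with the horizontal line y = y0
        return lambda p, q: (p[0] + (y0 - p[1]) / (q[1] - p[1]) * (q[0] - p[0]), y0)

    borders = [
        (lambda p: p[0] >= 0, cross_x(0)),
        (lambda p: p[0] <= width, cross_x(width)),
        (lambda p: p[1] >= 0, cross_y(0)),
        (lambda p: p[1] <= height, cross_y(height)),
    ]
    for inside, intersect in borders:
        if not polygon:
            break  # polygon lies entirely outside the canvas
        polygon = clip_edge(polygon, inside, intersect)
    return [(round(x), round(y)) for x, y in polygon]


# Polygon from the report; the page size is assumed
region = [(3040, 382), (3040, -2), (3219, -2), (3219, 382), (3216, 575), (3037, 569)]
print(clip_to_canvas(region, width=4000, height=5000))
# [(3040, 382), (3040, 0), (3219, 0), (3219, 382), (3216, 575), (3037, 569)]
```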

Unfortunately, in this case the raw polygons from Kraken make Shapely fail before we even get to that point:

INFO kraken.blla - Vectorizing regions
INFO kraken.blla - Vectorizing baselines
...
  File "/data/ocr-d/ocrd_all/venv/lib/python3.7/site-packages/ocrd_kraken/segment.py", line 85, in process
    res = self.segmenter(page_image)
  File "/data/ocr-d/ocrd_all/venv/lib/python3.7/site-packages/ocrd_kraken/segment.py", line 55, in segmenter
    return segment(img, **kwargs)
  File "/data/ocr-d/kraken/kraken/blla.py", line 315, in segment
    topline=net.user_metadata['topline'] if 'topline' in net.user_metadata else False)
  File "/data/ocr-d/kraken/kraken/blla.py", line 210, in vec_lines
    pol = calculate_polygonal_environment(baselines=[bl[1]], im_feats=im_feats, suppl_obj=suppl_obj, topline=topline)
  File "/data/ocr-d/kraken/kraken/lib/segmentation.py", line 710, in calculate_polygonal_environment
    bounds))
  File "/data/ocr-d/kraken/kraken/lib/segmentation.py", line 551, in _extract_patch
    polygon = np.array(roi_polygon.intersection(polygon).boundary.coords, dtype=int)
  File "/data/ocr-d/ocrd_all/venv/lib/python3.7/site-packages/shapely/geometry/base.py", line 582, in intersection
    return shapely.intersection(self, other, grid_size=grid_size)
  File "/data/ocr-d/ocrd_all/venv/lib/python3.7/site-packages/shapely/decorators.py", line 77, in wrapped
    return func(*args, **kwargs)
  File "/data/ocr-d/ocrd_all/venv/lib/python3.7/site-packages/shapely/set_operations.py", line 133, in intersection
    return lib.intersection(a, b, **kwargs)
shapely.errors.GEOSException: TopologyException: Input geom 1 is invalid: Self-intersection at 528.85981308411215 126.10280373831776

I suspect this is caused by mittagessen/kraken#319. (I was using Shapely 2.0.1 here.)
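
For readers unfamiliar with the GEOS message: a ring is self-intersecting when two of its non-adjacent edges cross. The sketch below is a pure-Python check for that condition, assuming proper crossings only (collinear overlaps and edges that merely touch are not detected). It is not what GEOS or Kraken do internally, and the function names are made up for illustration.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 properly cross (touching endpoints do not count)."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def is_self_intersecting(ring):
    """True if any two non-adjacent edges of the closed ring properly cross (O(n^2))."""
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge share ring[0], so they are adjacent
            if segments_cross(*edges[i], *edges[j]):
                return True
    return False


bowtie = [(0, 0), (2, 2), (2, 0), (0, 2)]  # edges cross at (1, 1)
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(is_self_intersecting(bowtie))  # True
print(is_self_intersecting(square))  # False
```

In Shapely itself, `Polygon.is_valid` reports this condition, and `shapely.validation.make_valid` or `buffer(0)` are the usual repair attempts. Whether either would be an appropriate fix inside Kraken is not established in this thread.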

bertsky (Collaborator) commented May 25, 2023

It does help to downgrade shapely to 1.8.5.post1, though.
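
A small guard can warn when the installed Shapely is on the failing side. This is only a sketch: the thread has just two data points (2.0.1 fails, 1.8.5.post1 works), so treating the major version as the dividing line is an assumption, and both function names are hypothetical. `importlib.metadata` requires Python 3.8 or newer.

```python
def is_shapely_2_or_newer(version: str) -> bool:
    """Return True if a version string such as '2.0.1' has major version >= 2."""
    return int(version.split(".")[0]) >= 2


def installed_shapely_version():
    """Return the installed Shapely version string, or None if it cannot be determined."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # Python < 3.8
        return None
    try:
        return version("shapely")
    except PackageNotFoundError:
        return None


found = installed_shapely_version()
if found is not None and is_shapely_2_or_newer(found):
    # Assumption: all 2.x releases behave like the 2.0.1 reported above
    print(f"Shapely {found} found; 1.8.5.post1 is reported to work")
```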
