
Add PaddleDetection-based Layout Model #54

Merged: 15 commits merged into Layout-Parser:master on Aug 17, 2021

Conversation


@an1018 an1018 commented Aug 6, 2021

Hi, I have reorganized the file structure of Paddle.
Please review. Thanks!

setup.py Outdated
@@ -27,12 +27,22 @@
"torch",
"torchvision",
"iopath",
"tqdm",


Please remove this one; installing paddleocr will automatically install this module.


if not enforce_cpu:
    # initial GPU memory (MB), device ID
    config.enable_use_gpu(200, 0)


Please change this to 2000.

@lolipopshock lolipopshock changed the title reorganize paddle structure Add PaddleDetection-based Layout Model Aug 6, 2021
@lolipopshock
Member

Thanks for this PR. Here are some general questions/thoughts:

  • Please remove the OCR part from this PR.
  • Please use PEP8 as the variable naming convention -- you could use black and pylint for checking.
  • For the preprocess.py and layoutmodel.py files:
    1. I think they contain a lot of unnecessarily complicated design -- can we simplify the functions to minimal complexity?
    2. And more fundamentally, why should the preprocess function live within layoutparser rather than paddledetection? I think it's paddledetection's obligation to do the preprocessing for the models.
  • Also, could you add tests for the paddle modules? You could follow the examples in test_model (a rough sketch follows below).
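
For illustration, a minimal test along those lines might look like this (the lp:// config path and the fixture image path are assumptions; mirror the actual entries used in tests/test_model.py):

import cv2
import layoutparser as lp


def test_paddle_detection_layout_model():
    # Hypothetical catalog entry; substitute the real config path.
    model = lp.PaddleDetectionLayoutModel(
        "lp://PubLayNet/ppyolov2_r50vd_dcn_365e/config"
    )
    # Hypothetical fixture; reuse the image from the existing tests.
    image = cv2.imread("tests/fixtures/model/test_model_image.jpg")
    layout = model.detect(image)
    # Every detected element should be a TextBlock inside a Layout.
    assert isinstance(layout, lp.Layout)
    assert all(isinstance(block, lp.TextBlock) for block in layout)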

@an1018
Contributor Author

an1018 commented Aug 7, 2021

> Thanks for this PR. Here are some general questions/thoughts:
>
> * Please remove the OCR part from this PR.
> * Please use PEP8 as the variable naming convention -- you could use `black` and `pylint` for checking.
> * For the `preprocess.py` and `layoutmodel.py` files:
>   1. I think they contain a lot of unnecessarily complicated design -- can we simplify the functions to minimal complexity?
>   2. And more fundamentally, why should the `preprocess` function live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models.
> * Also, could you add tests for the paddle modules? You could follow the examples in [`test_model`](https://github.com/Layout-Parser/layout-parser/blob/master/tests/test_model.py).

Thanks for your reply, I'll modify it later!

@littletomatodonkey

> Thanks for this PR. Here are some general questions/thoughts:
>
> • Please remove the OCR part from this PR.
> • Please use PEP8 as the variable naming convention -- you could use black and pylint for checking.
> • For the preprocess.py and layoutmodel.py files:
>   1. I think they contain a lot of unnecessarily complicated design -- can we simplify the functions to minimal complexity?
>   2. And more fundamentally, why should the preprocess function live within layoutparser rather than paddledetection? I think it's paddledetection's obligation to do the preprocessing for the models.
> • Also, could you add tests for the paddle modules? You could follow the examples in test_model.

Hi, for the preprocess.py and layoutmodel.py files:

  1. we decouple the preprocessing from the inference engine because it makes the code clearer.
  2. we extract the preprocessing code to reduce the dependency on paddledetection; otherwise users might have to download the whole repo, which is too heavy.

@an1018
Contributor Author

an1018 commented Aug 8, 2021

Hi, I've modified the code, including:

  • remove the OCR part

  • use pylint for checking
    (For the __init__.py and catalog.py files: these two had only minor modifications, so they were not checked.)

  • add tests for the paddle modules in test_model

setup.py Outdated
@@ -33,6 +33,15 @@
'google-cloud-vision==1',
'pytesseract'
],
"paddleocr": [
Member

Please don't change anything here - I'll modify it later myself.

Contributor Author

Done

@@ -11,11 +11,13 @@

from .ocr import (
GCVFeatureType, GCVAgent,
TesseractFeatureType, TesseractAgent
TesseractFeatureType, TesseractAgent,
PaddleocrAgent
Member

Please don't add OCR Agent in this PR -- right now we focus on LayoutModel.

Contributor Author

Done

@@ -86,7 +86,6 @@
0: "Table"
},
}
# fmt: on
Member

Why was this removed?

Contributor Author

Sorry, I've modified it back.

}

# fmt: off
LABEL_MAP_CATALOG = {
Member

Please remove the labeling mappings for unused datasets.

Contributor Author

Done

# fmt: on


class LayoutParserDetectron2ModelHandler(PathHandler):
Member

Change class name to LayoutParserPaddleModelHandler.

Contributor Author

Done

trt_min_shape=1,
trt_max_shape=1280,
trt_opt_shape=640,
min_subgraph_size=3):
Member

And is there any documentation, in English, on how to set and adjust these parameters? If not, I would suggest removing them.

Contributor Author

Done

enable_mkldnn=True,
thread_num=10,
min_subgraph_size=3,
use_dynamic_shape=False,
Member

And apparently some of the hyperparameters/configurations are not used. Please remove.

Contributor Author

Done

inputs = self.create_inputs(im, im_info)
return inputs

def postprocess(self, np_boxes, np_masks, inputs):
@lolipopshock lolipopshock (Member) Aug 6, 2021

I think we need to merge postprocess with gather_output. It's not reasonable to break them into two functions.
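
For illustration, the merged function might look roughly like this (assuming PaddleDetection's usual [class_id, score, x1, y1, x2, y2] row layout for np_boxes; the exact names are illustrative):

# Assumes: from layoutparser.elements import Layout, Rectangle, TextBlock
def postprocess(self, np_boxes):
    """Convert raw detector output directly into a layoutparser Layout."""
    layout = Layout()
    for class_id, score, x1, y1, x2, y2 in np_boxes:
        if score < self.threshold:
            continue  # drop low-confidence detections
        layout.append(
            TextBlock(
                Rectangle(x1, y1, x2, y2),
                type=self.label_map.get(int(class_id), int(class_id)),
                score=score,
            )
        )
    return layout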

Contributor Author

I've merged them, thanks!


if not enforce_cpu:
    # initial GPU memory (MB), device ID
    config.enable_use_gpu(2000, 0)
Member

Why is it 2000 rather than another value? Please leave a note in the comment. Thanks.
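
For example, the note could be as simple as this (the stated rationale is illustrative, not authoritative):

if not enforce_cpu:
    # Reserve an initial GPU memory pool of 2000 MB on device 0:
    # an empirical default that comfortably fits the layout models;
    # adjust downwards for smaller GPUs.
    config.enable_use_gpu(2000, 0)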

Contributor Author

Done

setup.py Outdated
@@ -33,6 +33,12 @@
'google-cloud-vision==1',
'pytesseract'
],
"paddlepaddle": [
Member

Sorry for not being clear earlier. Please remove all the new extras_require, thanks.

Contributor Author

Got it.

@@ -10,7 +10,7 @@ class DropboxHandler(HTTPURLHandler):
"""

def _get_supported_prefixes(self):
return ["https://www.dropbox.com"]
return ["https://www.dropbox.com","https://paddle-model-ecology.bj.bcebos.com"]
@lolipopshock lolipopshock (Member) Aug 11, 2021

I don't think this is a good idea -- perhaps it's better to create something like class PaddleModelURLHandler(HTTPURLHandler): in the paddle model catalog.py file.

Member

Please see the comments below for more detailed instructions.

layout = self.gather_output(np_boxes, np_masks)
return layout

def untar_files(self, model_tar, model_dir):
Member

It feels counter-intuitive to put this function inside the modeling folder; I think it's better to include this part inside the PathManager, by rewriting the _get_local_path function. So basically it will be something like this:
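
# Assumes the surrounding catalog.py already imports os, logging, uuid,
# urlparse (from urllib.parse), and iopath's HTTPURLHandler, get_cache_dir,
# file_lock, and download helpers.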

class PaddleModelURLHandler(HTTPURLHandler):
    """
    Supports download and file check for Paddle model links
    """

    def _get_supported_prefixes(self):
        return ["https://paddle-model-ecology.bj.bcebos.com"]

    def _isfile(self, path):
        return path in self.cache_map
    
    def _get_local_path(
        self,
        path: str,
        force: bool = False,
        cache_dir: Optional[str] = None,
        **kwargs: Any,
    ) -> str:
        """
        This implementation downloads the remote resource and caches it locally.
        The resource will only be downloaded if not previously requested.
        """
        self._check_kwargs(kwargs)
        if (
            force
            or path not in self.cache_map
            or not os.path.exists(self.cache_map[path])
        ):
            logger = logging.getLogger(__name__)
            parsed_url = urlparse(path)
            dirname = os.path.join(
                get_cache_dir(cache_dir), os.path.dirname(parsed_url.path.lstrip("/"))
            )
            filename = path.split("/")[-1]
            if len(filename) > self.MAX_FILENAME_LEN:
                filename = filename[:100] + "_" + uuid.uuid4().hex

            cached = os.path.join(dirname, filename)
            with file_lock(cached):
                if not os.path.isfile(cached):
                    logger.info("Downloading {} ...".format(path))
                    cached = download(path, dirname, filename=filename)
                
#### ----> Add from here 
                if path.endswith(".tar"):
                    untar_function() 

            logger.info("URL {} cached in {}".format(path, cached))
            self.cache_map[path] = cached
        return self.cache_map[path]

You might also need to check whether the untar'ed version exists beforehand to avoid duplicated downloading.
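
For example, a guard along these lines inside _get_local_path, placed before the download is triggered (illustrative only; reuses the cached naming from the snippet above):

if path.endswith(".tar"):
    untarred_dir = cached[: -len(".tar")]
    if os.path.isdir(untarred_dir):
        # Extracted files already present; skip re-downloading the archive.
        self.cache_map[path] = untarred_dir
        return untarred_dir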

image, im_info = permute(image, im_info)

inputs = {}
inputs['image'] = np.array((image, )).astype('float32')
@lolipopshock lolipopshock (Member) Aug 11, 2021

This feels weird -- probably it's better to have np.array(image)[np.newaxis, :].astype('float32').

config of model, defined by `Config(model_dir)`
model_path (str):
The path to the saved weights of the model.
threshold (float):
@lolipopshock lolipopshock (Member) Aug 11, 2021

Don't forget to clean up the docstrings -- and you can specify what the extra configs are under the extra_configs block.

config_path = self._reconstruct_path_with_detector_name(config_path)
model_tar = PathManager.get_local_path(config_path)

pre_dir = os.path.dirname(model_tar)
Member

Please remove this -- see the details below.

thread_num=extra_config.get('thread_num',10))

self.threshold = extra_config.get('threshold',0.5)
self.input_shape = extra_config.get('input_shape',[3,640,640])
Member

As input_shape is in self.im_info, you might want to remove this variable.

self.threshold = extra_config.get('threshold',0.5)
self.input_shape = extra_config.get('input_shape',[3,640,640])
self.label_map = label_map
self.im_info = {
@lolipopshock lolipopshock (Member) Aug 11, 2021

So basically this is something like default_image_info, right? It's better to use a more descriptive name than im_info, which is also the output of the preprocessors.

im_info (dict): info of processed image
"""
image = image.astype(np.float32, copy=False)
mean = np.array(mean)[np.newaxis, np.newaxis, :]
@lolipopshock lolipopshock (Member) Aug 11, 2021

Why initialize the arrays like this every time? We can just initialize mean and std once as the appropriate ndarrays.
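
i.e., something like this in the initializer, computed once (the attribute names and the ImageNet statistics are placeholders; use the model's actual values):

# Broadcastable over (H, W, C) images without per-call reshaping.
self.mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)[np.newaxis, np.newaxis, :]
self.std = np.array([0.229, 0.224, 0.225], dtype=np.float32)[np.newaxis, np.newaxis, :]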

inputs (dict): input of model
"""
# read rgb image
image, im_info = decode_image(image, self.im_info)
Member

I don't like the design here:

  1. In the current setting, self.im_info will be changed for every image input, which doesn't make sense.
  2. im_info is only changed in the resize function, and decode_image doesn't change the image at all - can we further simplify the APIs?

For example, an alternative could be:

image = image.copy()

input_shape = np.array(image.shape[:2], dtype=np.float32)
image, scale_factor = resize(image) # change the return value 
image = (image - self.extra_config['pixel_mean']) \
    / self.extra_config['pixel_std']  # may need to change how the values are loaded
image = image.transpose((2, 0, 1)).copy()  # The model requires channel-first input

model_input_image_shape = np.array(image.shape[1:], dtype=np.float32)  # (H, W) after the transpose

image_info = {
    'scale_factor': scale_factor,
    'im_shape': model_input_image_shape,
    'input_shape': input_shape,
}

@lolipopshock
Member

lolipopshock commented Aug 11, 2021

Thanks for the updates! The current version is better than the previous one, but I think the code could be further simplified and more structured -- please check the comments for possible updates.

@lolipopshock
Member

I've also carefully checked the code, and added some changes. Some key updates:

  1. improve the downloading logic -- it will remove the tar file after downloading and check for the existence of the untar'ed files (a rough sketch below).
  2. further simplify layoutmodel.py, removing unnecessary variables and definitions.
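
The tar handling is roughly along these lines (simplified sketch; the helper name is illustrative):

import os
import tarfile


def _untar_if_needed(tar_path: str) -> str:
    """Extract a downloaded model archive, then delete the tar file.

    Both steps are skipped if the extracted directory already exists.
    """
    target_dir = tar_path[: -len(".tar")]
    if not os.path.isdir(target_dir):
        with tarfile.open(tar_path) as tar:
            tar.extractall(target_dir)
        os.remove(tar_path)  # reclaim disk space once extracted
    return target_dir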

Please note: in the previous version you mentioned that the paddle model may support specifying batch_size during inference; however, that variable was never used.

And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2MB/s in the US, taking around 2~3 minutes to download the whole model, which might be suboptimal.

Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

@littletomatodonkey

> I've also carefully checked the code, and added some changes. Some key updates:
>
> 1. improve the downloading logic -- it will remove the tar file after downloading and check for the existence of the untar'ed files.
> 2. further simplify layoutmodel.py, removing unnecessary variables and definitions.
>
> Please note: in the previous version you mentioned that the paddle model may support specifying batch_size during inference; however, that variable was never used.
>
> And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2MB/s in the US, taking around 2~3 minutes to download the whole model, which might be suboptimal.
>
> Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!

Hi, thanks for your new changes, which further simplify the code.

  • For the model download source, it is OK to support dropbox/aws, but we hope the Baidu source remains the preferred choice, since there are also many developers in China.
  • Batch size is not used in the code; it was just designed for future expansion. Thanks for checking.
  • The changes are OK for me. Thanks again for your careful review!

@an1018
Contributor Author

an1018 commented Aug 17, 2021

The changes are OK for me, too.

Through this repo and the whole modification process, I also learned a lot. Thank you very much!

@lolipopshock
Member

> > I've also carefully checked the code, and added some changes. Some key updates:
> >
> > 1. improve the downloading logic -- it will remove the tar file after downloading and check for the existence of the untar'ed files.
> > 2. further simplify layoutmodel.py, removing unnecessary variables and definitions.
> >
> > Please note: in the previous version you mentioned that the paddle model may support specifying batch_size during inference; however, that variable was never used.
> > And in the future we might need to move the paddle model to dropbox/aws, as the downloading speed is ~2MB/s in the US, taking around 2~3 minutes to download the whole model, which might be suboptimal.
> > Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!
>
> Hi, thanks for your new changes, which further simplify the code.
>
> • For the model download source, it is OK to support dropbox/aws, but we hope the Baidu source remains the preferred choice, since there are also many developers in China.
> • Batch size is not used in the code; it was just designed for future expansion. Thanks for checking.
> • The changes are OK for me. Thanks again for your careful review!

I think ultimately we'll provide better downloading support, e.g., mirrored download sites or something like what's mentioned in #43. At that point, people from different regions can choose the best download method accordingly.

@lolipopshock lolipopshock merged commit 035f66a into Layout-Parser:master Aug 17, 2021