Ablation study for each component. MS indicates DHTs with multi-scale features as described in~\cref{sec:ms-dht-fpn}, and CTX means context-aware aggregation as described in~\cref{sec:ctx-line-detector}.
}\vspace{-6pt}
% \resizebox{0.8\textwidth}{!}{
\begin{tabular}{C|C|C|C}
  \toprule
  DHT & MS & CTX & F-measure \\
  % HED+HT & DHT+RHT & MS & CTX & mean \emph{F-measure} \\
  \hline
  % \checkmark & & & 0.846 \\
  % \checkmark & & & & 0.829 \\
  \checkmark & & & 0.664 \\
  \checkmark & \checkmark & & 0.758 \\
  \checkmark & & \checkmark & 0.771 \\
  % \checkmark & \checkmark & & 0.852 \\
  \checkmark & \checkmark & \checkmark & 0.786 \\
  \bottomrule
  %-----------------------------------------------%
\end{tabular}
\label{tab:ablation}
\end{table}
}
\subsubsection{Edge-guided Refinement} \label{sec:ablation-refinement}
Here we ablate the ``Edge-guided Refinement'' module \revise{(abbreviated as ER)}.
%
First, we test the performance of DHT+ER using different $\delta_r$.
%
The $\delta_r$ parameter controls the size of the search space in ER ($\mathcal{L}$ in~\cref{eq:refine-search}).
%
This experiment is conducted on the SEL dataset using the ResNet50 backbone.
\CheckRmv{
\begin{table}[!htb]
\renewcommand{\arraystretch}{1.3}
\newcolumntype{C}{>{\centering\arraybackslash}p{0.08\textwidth}}
\centering
\caption{
Performance of DHT+ER with different $\delta_r$.
%
Models are trained/tested on the SEL dataset using the ResNet50 backbone.
%
$\delta_r=0$ represents the vanilla DHT method without ER.
}\vspace{-6pt}
\newcommand{\CC}{\cellcolor{gray!20}}
\begin{tabular}{C|C|C|C}
  \toprule
  $\delta_r$ & Precision & Recall & F-measure\\
  \hline
  \CC 0 & \CC 0.8190 & \CC 0.7530 & \CC 0.7861 \\
  1 & 0.8199 & 0.7561 & 0.7866 \\
  3 & 0.8208 & 0.7569 & 0.7874 \\
  5 & 0.8214 & 0.7574 & 0.7880 \\
  7 & 0.8213 & 0.7573 & 0.7878 \\
  9 & 0.8212 & 0.7571 & 0.7877 \\
  \bottomrule
  %-----------------------------------------------%
\end{tabular}
\label{tab:ablation-refinement-1}
\end{table}
}
Results in~\cref{tab:ablation-refinement-1} show that the performance first increases and then saturates as $\delta_r$ grows.
%
Since the peak performance occurs at $\delta_r = 5$, we set $\delta_r=5$.
%
After setting $\delta_r$ to 5, we compare the performance of our method with and without ER, using different backbones \revise{and} datasets.
\CheckRmv{
\begin{table}[!htb]
\renewcommand{\arraystretch}{1.3}
\centering
\caption{
Performance with and without ER ($\delta_r=5$) using different backbones \revise{and} datasets.
}\vspace{-6pt}
\newcommand{\CC}{\cellcolor{gray!20}}
\begin{tabular}{l|c|c|c|c|c|c}
  \toprule
  Dataset & Arch & Edge & P & R & F & F@0.95\\
  \hline
  \multirow{4}{*}{SEL~\cite{lee2017semantic}} & VGG16 & & 0.756 & 0.774 & 0.765 & 0.380\\
  & \CC VGG16 & \CC \checkmark & \CC 0.758 & \CC 0.777 & \CC 0.770 & \CC 0.439 \\
  & ResNet50 & & 0.819 & 0.753 & 0.786 & 0.420\\
  & \CC ResNet50 & \CC \checkmark & \CC 0.821 & \CC 0.757 & \CC 0.788 & \CC 0.461\\
  \hline
  \multirow{4}{*}{\revise{NKL}} & VGG16 & & \revise{0.659} & \revise{0.759} & \revise{0.706} & \revise{0.434}\\
  & \CC VGG16 & \CC\checkmark & \CC \revise{0.664} & \CC \revise{0.765} & \CC \revise{0.711} & \CC \revise{0.472}\\
  & ResNet50 & & \revise{0.679} & \revise{0.766} & \revise{0.719} & \revise{0.459}\\
  & \CC ResNet50 & \CC \checkmark & \CC \revise{0.684} & \CC \revise{0.771} & \CC \revise{0.725} & \CC \revise{0.486}\\
  \bottomrule
  %-----------------------------------------------%
\end{tabular}
\label{tab:ablation-refinement-2}
\end{table}
}
Results in~\cref{tab:ablation-refinement-2} show that edge-guided refinement consistently improves detection results across backbone architectures and datasets, with the largest gains in F@0.95.
%-----------------------------------------------------------------------------------%
\section{Conclusions}\label{sec:conclusion}
%-----------------------------------------------------------------------------------%
In this paper, we proposed a simple yet effective method for semantic line detection in natural scenes.
%
By incorporating the strong learning ability of CNNs into the classical Hough transform, our method is able to capture complex textures and rich contextual semantics of lines.
%
To better assess the similarity between a pair of lines, we designed a new evaluation metric considering both the Euclidean distance and the angular distance between lines.
%
Besides, a new dataset for semantic line detection was constructed to fill the gap between the scale of existing datasets and the complexity of modern CNN models.
%
Both quantitative and qualitative results revealed that our method significantly outperforms prior art in terms of both detection quality and speed.

Acknowledgment: This research was supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400, NSFC (61922046, 61620106008, 62002176), the S\&T innovation project from the Chinese Ministry of Education, and the Tianjin Natural Science Foundation (17JCJQJC43700).