Numerical Methods in Deep Learning and Computer Vision (2024)

Numerical methods, the collective name for numerical analysis and optimization techniques, are widely used in computer vision and deep learning. In this thesis, we investigate several numerical methods and their applications in deep learning. The techniques studied mainly include differentiable matrix power functions, differentiable eigendecomposition (ED), feasible orthogonal matrix constraints in optimization and latent semantics discovery, and physics-informed techniques for solving partial differential equations in disentangled and equivariant representation learning. We first propose two numerical solvers for faster computation of the matrix square root and its inverse. The proposed algorithms are shown to deliver considerable speedups in practical computer vision tasks. We then address the main issues that arise when integrating differentiable ED into deep learning: backpropagation instability, slow decomposition of batched matrices, and ill-conditioned input throughout training. Approximation techniques are first leveraged to closely approximate the backward gradients while avoiding gradient explosion, which resolves the backpropagation instability. To improve the computational efficiency of ED, we propose an efficient ED solver dedicated to the small and medium batched matrices that are frequently encountered as input in deep learning. Orthogonality techniques are also proposed to improve input conditioning. Together, these techniques mitigate the difficulty of applying differentiable ED in deep learning. In the last part of the thesis, we rethink some key concepts in disentangled representation learning. We first investigate the relation between disentanglement and orthogonality: different proposed orthogonality constraints are enforced on generative models to show that disentanglement performance indeed improves. We also challenge the linear assumption of latent traversal paths and propose to model the traversal process as dynamic spatiotemporal flows on potential landscapes. Finally, we build probabilistic generative models of sequences that allow for novel understandings of equivariance and disentanglement. We expect that our investigation can pave the way for more in-depth and impactful research at the intersection of numerical methods and deep learning.
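As a point of reference for the matrix square root and inverse square root mentioned in the abstract, the sketch below uses the classical coupled Newton-Schulz iteration. It is a well-known baseline and is not claimed to be either of the two solvers proposed in the thesis. The function names (`matmul`, `identity`, `newton_schulz_sqrt`), the iteration count, and the 2x2 test matrix are illustrative assumptions, and the input is assumed to be symmetric positive definite.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def newton_schulz_sqrt(a, num_iters=20):
    """Approximate sqrt(A) and inverse sqrt(A) for a symmetric positive
    definite A with the coupled Newton-Schulz iteration (assumed baseline)."""
    n = len(a)
    # Scale by the Frobenius norm so the eigenvalues lie in (0, 1],
    # which puts the iteration inside its convergence region.
    norm = math.sqrt(sum(x * x for row in a for x in row))
    y = [[x / norm for x in row] for row in a]   # converges to sqrt(A / norm)
    z = identity(n)                              # converges to (A / norm)^(-1/2)
    eye = identity(n)
    for _ in range(num_iters):
        zy = matmul(z, y)
        t = [[0.5 * (3.0 * eye[i][j] - zy[i][j]) for j in range(n)]
             for i in range(n)]
        y = matmul(y, t)
        z = matmul(t, z)
    # Undo the scaling.
    s = math.sqrt(norm)
    sqrt_a = [[x * s for x in row] for row in y]
    inv_sqrt_a = [[x / s for x in row] for row in z]
    return sqrt_a, inv_sqrt_a

# Hypothetical test matrix (symmetric positive definite).
A = [[4.0, 1.0], [1.0, 3.0]]
sqrt_A, inv_sqrt_A = newton_schulz_sqrt(A)
print(matmul(sqrt_A, sqrt_A))      # should be close to A
print(matmul(sqrt_A, inv_sqrt_A))  # should be close to the identity
```

The iteration uses only matrix multiplications, which is why schemes of this kind are attractive on GPUs. The Frobenius-norm scaling is one common way to guarantee convergence for positive definite inputs.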

Numerical Methods in Deep Learning and Computer Vision / Song, Yue. - (2024 Apr 23), pp. 1-156.

Song, Yue
2024-04-23

XXXVI

2023-2024

Ingegneria e scienza dell'Informaz (29/10/12-)

Information and Communication Technology

Sebe, Niculae

no

English

Sector INF/01 - Informatica (Computer Science)

08.1 Doctoral Thesis

Files in this item:

File: 2024_Thesis_Yue.pdf
Access: open access
Type: Doctoral Thesis
License: All rights reserved
Size: 61.2 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/406633

    FAQs

    Are numerical methods useful for computer science?

    Numerical analysis is fundamental to data science and data analysis. It is the study of methods and algorithms that use computers to produce numerical solutions to mathematical problems.

    What is a numerical computational method?

    Numerical computing is an approach for solving complex mathematical problems using only simple arithmetic operations [1]. The approach involves formulating mathematical models of physical situations that can be solved with arithmetic operations [2]. It requires the development, analysis, and use of algorithms.
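To make "only simple arithmetic operations" concrete, here is a minimal sketch of Heron's (Newton's) method, which approximates a square root using nothing but addition, multiplication, and division. The function name `heron_sqrt`, the tolerance, and the iteration cap are illustrative assumptions, not something taken from the sources cited above.

```python
def heron_sqrt(a, tol=1e-12, max_iters=100):
    """Approximate sqrt(a) for a >= 0 with Heron's iteration
    x_next = (x + a / x) / 2, stopping on a small relative change."""
    if a < 0:
        raise ValueError("a must be non-negative")
    if a == 0:
        return 0.0
    x = a if a >= 1 else 1.0  # starting guess at or above the true root
    for _ in range(max_iters):
        x_next = 0.5 * (x + a / x)
        if abs(x_next - x) <= tol * x_next:
            return x_next
        x = x_next
    return x

print(heron_sqrt(2.0))
print(heron_sqrt(0.25))
```

Each step averages the current guess with `a / x`, so the whole computation reduces to a short sequence of elementary operations.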

    How is deep learning used in computer vision?

    One example is object detection: deep learning algorithms identify and localize objects within images, using CNNs to propose regions of interest and then classify and refine those regions to detect the objects of interest together with their bounding boxes.

    Are numerical methods used in machine learning?

    Numerical methods play a critical role in machine learning, deep learning, artificial intelligence, and data science. These methods are essential for solving complex mathematical problems that are common in these fields.

    What are the real-life applications of numerical methods?

    Engineering: Numerical solutions to equations are used in engineering to model and simulate complex physical systems, such as fluid dynamics, structural mechanics, and control systems. For example, numerical solutions to the Navier-Stokes equations are used to model the flow of fluids around aircraft and automobiles.

    Why do engineers need to study numerical methods?

    Mastering numerical methods is an important skill for engineers and scientists, as most engineering problems involve developing a mathematical model that represents the important characteristics of a physical system.

    What are the 4 computational methods?

    There are four key techniques (cornerstones) of computational thinking:
    • decomposition: breaking down a complex problem or system into smaller, more manageable parts.
    • pattern recognition: looking for similarities among and within problems.
    • abstraction. ...
    • algorithms.

    What are numerical methods in computer graphics?

    The mathematical topics that are often the most useful to graphics are so-called Numerical Methods. These are the tools that take abstract mathematical concepts (differentiation, integration, matrix inversion, etc.) and turn them into concrete algorithms that we can use to find numerical results to the problem at hand.

    What are the roles of numerical methods in computing?

    Error Analysis and Stability: Numerical methods help in analyzing the errors introduced during computations and ensure the stability of algorithms. Engineers use techniques like Richardson extrapolation and error propagation analysis to quantify and minimize errors, ensuring the reliability of computational results.
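As a small illustration of the Richardson extrapolation mentioned above, the sketch below applies it to a central-difference derivative. Combining estimates at step sizes `h` and `h / 2` cancels the leading O(h^2) error term. The function names, the test function `sin`, and the step size are illustrative assumptions.

```python
import math

def central_diff(f, x, h):
    """Central-difference estimate of f'(x); truncation error is O(h^2)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def richardson_diff(f, x, h):
    """Richardson extrapolation of the central difference: combining the
    estimates at h and h/2 cancels the h^2 term, leaving an O(h^4) error."""
    return (4.0 * central_diff(f, x, h / 2.0) - central_diff(f, x, h)) / 3.0

x0, h = 1.0, 0.1
exact = math.cos(x0)  # derivative of sin at x0
err_plain = abs(central_diff(math.sin, x0, h) - exact)
err_rich = abs(richardson_diff(math.sin, x0, h) - exact)
print(err_plain, err_rich)
```

Comparing the two errors shows how extrapolation can both improve an estimate and give a practical handle on how large the remaining error is.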

    Should I learn deep learning before computer vision?

    Well, I personally think it is best to start with machine learning (ML) and then move to computer vision (CV). There is a lot of overlap between them, though: as you learn ML you will come across CV problems built on datasets like MNIST, CIFAR-10, CIFAR-100, or ImageNet, which belong to both CV and ML.

    What is a common application of deep learning in computer vision?

    Deep learning for computer vision is essential for applications such as autonomous driving and robotics, where vehicles or robots must detect and respond to different objects in their environment, and healthcare, where it is used to analyze medical images.

    Is computer vision AI or ML?

    Computer vision applications use artificial intelligence and machine learning (AI/ML) to process visual data accurately for object identification and facial recognition, as well as classification, recommendation, monitoring, and detection.

    What is the most popular numerical method?

    1) Finite Element Method (FEM):

    FEM is among the most popular numerical methods in engineering analysis. Applications: linear, nonlinear, buckling, thermal, dynamic, and fatigue analysis.

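    Which numerical method is best?

    Among the basic numerical integration rules, Simpson's rule is typically the most accurate and the fastest to converge for smooth integrands. This is easiest to see on more complicated functions, for which none of the rules finds the exact solution. For example, for f(x) = sin(x), the error for n = 1 is large but decreases rapidly for n = 4 and n = 10. For f(x) = x^3, the same holds for the lower-order rules, while Simpson's rule is already exact because it integrates cubic polynomials exactly.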
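The convergence behaviour described above can be checked with a short sketch comparing the composite trapezoidal rule and composite Simpson's rule on sin(x) over [0, pi], whose exact integral is 2. The function names and the integration interval are assumptions. Here `n` counts subintervals for the trapezoidal rule and panels of two subintervals each for Simpson's rule, which may differ from the convention in the quoted source.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def simpson(f, a, b, n):
    """Composite Simpson's rule with n panels (2 * n subintervals)."""
    m = 2 * n
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        # Odd interior nodes get weight 4, even interior nodes weight 2.
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return total * h / 3.0

exact = 2.0  # integral of sin(x) over [0, pi]
for n in (1, 4, 10):
    err_trap = abs(trapezoid(math.sin, 0.0, math.pi, n) - exact)
    err_simp = abs(simpson(math.sin, 0.0, math.pi, n) - exact)
    print(n, err_trap, err_simp)
```

For smooth integrands, the trapezoidal error shrinks like h^2 and the Simpson error like h^4, which is why the Simpson column falls off much faster as `n` grows.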

    What are examples of numerical methods?

    Methods such as the finite difference method (FDM), finite volume method (FVM), finite element method (FEM), and boundary element method (BEM) are commonly used to solve PDEs numerically. Any numerical method used to solve PDEs should be consistent, stable, and convergent.

    Do you need maths methods for computer science?

    Computer science operates on the language of math. That means earning your bachelor's degree in computer science will likely require taking several math courses. Of course, the number and kinds of classes will depend on your program.

    How useful are numerical methods?

    Numerical methods can solve problems quickly and easily compared with deriving analytic solutions. Whether the goal is integration or the solution of complex differential equations, many tools are available that reduce what can be quite difficult analytical math to simple algebra.

    What are the advantages of studying numerical methods?

    Numerical techniques for ordinary differential equations (ODEs) offer several advantages. They can provide high accuracy and fast convergence, making them efficient for solving complex engineering problems.

    Which programming language is best for numerical analysis?

    MATLAB is a widely used proprietary software for performing numerical computations. It comes with its own programming language, in which numerical algorithms can be implemented.
