Protein Gaussian Image (PGI)

CANTONI, VIRGINIO;
2012-01-01

Abstract

A well-known shape representation commonly used in 3D object recognition is the Extended Gaussian Image (EGI), which maps the histogram of the orientations of the object surface onto the unit sphere. We propose an analogous “abstract” data structure, named Protein Gaussian Image (PGI), for representing the orientation of protein secondary structures (e.g. helices or strands); it combines the characteristics of the EGI with those of needle maps. The “concrete” data structure is the same as for the EGI: a hierarchy that starts with a discretization corresponding to the 20 orientations of the icosahedron facets and is iteratively refined by a factor of 4 at each new level (80, 320, 1280, ...) up to the maximum precision required. In this case, however, each orientation is associated not with the area of the patches having that orientation but with the features of the protein secondary structures having that direction. These features may include the sense (from the origin towards the surface or vice versa), the length of the structure (e.g. the number of amino acids), biochemical properties, and even the sequence of amino acids (stored as a list). We consider this representation very effective for preliminary screening when searching a protein database for a given structural block, a domain, or even an entire protein. Indeed, on this structure it is possible to identify the presence of a given motif, or of sheets (note that parallel and anti-parallel β-sheets are characterized by common or opposite directions of the ladders, respectively). Some known proteins are described here, with common typical motifs easily marked in the PGI.
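The coarsest level of the structure described in the abstract can be sketched as follows. This is only an illustrative reading of the abstract, not the author's implementation: the names `icosahedron_face_normals`, `cells_at_level`, `Element` and `PGI` are hypothetical, each secondary structure is assumed to be summarized by a single direction vector (first residue to last, so that the sense is carried by the vector itself), and only the 20-facet level is built, with no refinement.

```python
import itertools
import math
from collections import defaultdict
from dataclasses import dataclass

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def icosahedron_face_normals():
    """Return the 20 unit normals of the facets of a regular icosahedron."""
    # 12 vertices: cyclic permutations of (0, +-1, +-PHI); edge length is 2.
    verts = []
    for a, b in itertools.product((-1.0, 1.0), (-PHI, PHI)):
        verts += [(0.0, a, b), (b, 0.0, a), (a, b, 0.0)]
    normals = []
    for tri in itertools.combinations(verts, 3):
        # Three mutually adjacent vertices (pairwise distance 2) form a facet.
        if all(math.isclose(math.dist(p, q), 2.0)
               for p, q in itertools.combinations(tri, 2)):
            centroid = [sum(c) / 3 for c in zip(*tri)]
            norm = math.sqrt(sum(x * x for x in centroid))
            normals.append(tuple(x / norm for x in centroid))
    return normals


def cells_at_level(level):
    """Number of orientation cells after `level` refinements by a factor of 4."""
    return 20 * 4 ** level


@dataclass
class Element:
    """Features stored per secondary structure (hypothetical field choice)."""
    kind: str           # e.g. "helix" or "strand"
    length: int         # number of amino acids
    sequence: str = ""  # optional amino-acid sequence


class PGI:
    """Level-0 sketch: map each direction to one of the 20 facets."""

    def __init__(self):
        self.normals = icosahedron_face_normals()
        self.cells = defaultdict(list)  # facet index -> list of Element

    def cell_of(self, direction):
        # Nearest facet = largest dot product with the facet normal.
        # The argmax does not depend on the length of `direction`.
        return max(range(len(self.normals)),
                   key=lambda i: sum(d * n for d, n in
                                     zip(direction, self.normals[i])))

    def add(self, direction, element):
        self.cells[self.cell_of(direction)].append(element)


pgi = PGI()
pgi.add((0.3, 0.5, 0.9), Element("strand", 6))
pgi.add((-0.3, -0.5, -0.9), Element("strand", 7))  # opposite sense
print([cells_at_level(k) for k in range(4)])  # [20, 80, 320, 1280]
```

Because the icosahedron is centrally symmetric, two strands with opposite directions fall into antipodal facets in this sketch, which is one way the anti-parallel pairing mentioned in the abstract could be read off the structure.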
Use this identifier to cite or link to this document: https://hdl.handle.net/11571/465121