
Monday, September 26, 2022

[FIXED] What are the reasons for not hosting a compiler on a live server?

September 26, 2022 binaries, compiler-construction, deployment, pip, python No comments

Issue

Where I currently work we've had a small debate about deploying our Python code to the production servers. I voted to build binary dependencies (like the Python MySQL drivers) on the server itself, just using pip install -r requirements.txt. This was quickly vetoed with no better explanation than "we don't put compilers on the live servers". As a result our deployment process is becoming convoluted and over-engineered simply to avoid this compilation step.

My question is this: What's the reason these days to avoid having a compiler on live servers?

Solution

In general, the prevailing wisdom on server installs is that they should be as stripped-down as possible. There are a few motivations for this, but they don't really apply all that directly to your question about a compiler:

Minimize resource usage. GCC might take up a little extra disk space, but probably not enough to matter - and it won't be running most of the time, so CPU/memory usage isn't a big concern.
Minimize complexity. Building on your server might add a few more failure modes to your build process (if you build elsewhere, then at least you will notice something wrong before you go mess with your production server), but otherwise, it won't get in the way.
Minimize attack surface. As others have pointed out, by the time an attacker can make use of a compiler, you're probably already screwed.

At my company, we generally don't care too much if compilers are installed on our servers, but we also never run pip on our servers, for a rather different reason. We're not so concerned about where packages are built, but when and how they are downloaded.

The particularly paranoid among us will take note that pip (and easy_install) will happily install packages from PYPI without any form of authentication (no SSL, no package signatures, ...). Further, many of these aren't actually hosted on PYPI; pip and easy_install follow redirects. So, there are two problems here:

If PyPI, or any of the other sites on which your dependencies are hosted, goes down, then your build process will fail.
If an attacker somehow manages to perform a man-in-the-middle attack against your server as it's attempting to download a dependency package, then they'll be able to insert malicious code into the download.

So, we download packages when we first add a dependency, do our best to make sure the source is genuine (this is not foolproof), and add them into our own version-control system. We do actually build our packages on a separate build server, but this is less crucial; we simply find it useful to have a binary package we can quickly deploy to multiple instances.
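As a rough illustration of checking vendored packages before a build, here is a minimal sketch that compares each archive's SHA-256 digest against a pinned value. The function names (sha256_of, find_mismatched) and the idea of keeping a filename-to-digest mapping alongside the archives are assumptions made for this example, not something described in the answer above.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched(directory, pinned):
    """Return the names of pinned files that are missing or whose digest differs.

    `pinned` is a hypothetical mapping of archive filename -> expected hex digest,
    recorded when the dependency was first added to version control.
    """
    mismatched = []
    for name, expected in pinned.items():
        path = Path(directory) / name
        if not path.is_file() or sha256_of(path) != expected:
            mismatched.append(name)
    return mismatched
```

A deployment script could refuse to continue whenever find_mismatched returns a non-empty list. This only detects changes made after the digest was recorded; it does not prove the original download was genuine.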

Answered By - akgood

Answer Checked By - Dawn Plyler (PHPFixing Volunteer)

[FIXED] How does casting from integers to floating-point numbers work?

July 20, 2022 casting, compiler-construction, floating-point, integer No comments

Issue

Assuming 32-bit values (int32_t, float), they are stored in memory as follows:

// 255
int:   11111111 00000000 00000000 00000000 (big endian)
int:   00000000 00000000 00000000 11111111 (little endian)
float: 0 11111111 000000000000000000000

By this point it's fairly obvious that the memory itself is arranged differently, depending on the interpreted type.

Further assuming a standard C-style cast, how is this achieved? I usually work with x86(_64) and ARMHF CPUs, but I'm not familiar with their respective assembly languages or the way the CPUs are organised internally, so please excuse if this would be answered fairly simply by knowing the internals of these CPUs. Primarily of interest, are how C/++ and C# handle this cast.

Does the compiler generate instructions which interpret the sign-bit and the exponent portion and just converts them over to a memory structure representing an integer, or is there some magic going on in the background?
Do x86_64 and ARMHF have built-in instructions to handle this sort of thing?
Or: does a C-style cast simply copy the memory and it's up to the runtime to interpret whatever value pops out (seems unlikely, but I may be mistaken)?

The suggested posts Why are floating point numbers inaccurate? and Why can't decimal numbers be represented exactly in binary? do help with understanding basic concepts of floating-point math, but do not answer this question.

Solution

If int: 11111111 00000000 00000000 00000000 (big endian) is showing us the bytes in memory order (lowest address to highest address), then that is little endian, not big endian: The least significant bits of 255, 11111111, are in the low address, and the most significant bits, 00000000, are in the high address.
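The byte-order point can be checked with a short sketch. The helper name as_bits is made up for this example; int.to_bytes is the standard Python method for laying out an integer in a given byte order, lowest address first.

```python
def as_bits(data):
    """Format bytes in memory order (lowest address first) as groups of 8 bits."""
    return " ".join(f"{byte:08b}" for byte in data)


value = 255
little = value.to_bytes(4, byteorder="little")  # least significant byte first
big = value.to_bytes(4, byteorder="big")        # most significant byte first

little_bits = as_bits(little)
big_bits = as_bits(big)
```

Here little_bits is the pattern the question labelled "big endian", and big_bits is the one it labelled "little endian".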

float: 0 11111111 000000000000000000000 is not the encoding of 255 in the format most commonly used for float, IEEE-754 binary32. The bits would be 0 10000110 11111110000000000000000 (437F0000₁₆, or, when stored little-endian, 00 00 7F 43). The exponent code of 134 represents an exponent of 134−127 = 7, and the significand field represents 1.1111111₂ = 1.9921875, so the entire value represented is +1.9921875•2⁷ = 255.
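The field breakdown above can be reproduced with a small sketch using Python's struct module. The function name decode_binary32 is hypothetical; the field widths (1 sign bit, 8 exponent bits, 23 fraction bits) and the bias of 127 are those of IEEE-754 binary32.

```python
import struct


def decode_binary32(value):
    """Split the binary32 encoding of `value` into (bits, sign, exponent code, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent_code = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return bits, sign, exponent_code, fraction


bits, sign, exponent_code, fraction = decode_binary32(255.0)

# For normal numbers the significand has an implicit leading 1.
significand = 1 + fraction / 2**23
reconstructed = (-1) ** sign * significand * 2 ** (exponent_code - 127)

# The same value as it would sit in memory on a little-endian machine.
little_endian_bytes = struct.pack("<f", 255.0)
```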

A compiler will generate whatever instructions it needs to work with values. Typically, processors with hardware support for floating-point have different instructions, and often different registers, for integer and floating-point values. To work with an int, the compiler will generate instructions to load it into a general register and integer-arithmetic instructions to operate on it. To work with a float, the compiler will generate instructions to load it into a floating-point register and floating-point-arithmetic instructions to operate on it.

If the hardware does not have hardware floating-point support, the compiler generates instructions to interpret and process the bits representing the float in ways necessary to produce the correct results. Much of this is done by calling routines from a library of software-floating-point routines. Inside those routines, the instructions break down the parts of a floating-point representation, do computations as necessary, and reassemble the parts to produce floating-point results.

x86_64 has built-in floating-point instructions.

ARMHF has built-in floating-point instructions; the HF stands for Hardware Floating-point or Hard Float. (I do not have information that that is an official ARM designation; it may be colloquial.)

When you cast an int to float or vice-versa in C, the compiler uses a built-in instruction to perform the conversion (unless optimization provides another solution), if the hardware has such an instruction. The hardware instruction manipulates the bits of the representation to compute the result. If the hardware does not have an instruction for this, the compiler generates whatever instructions it needs, likely calling a routine from a library as above.
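To give a feel for what a software conversion routine does when there is no hardware instruction, here is a sketch of a 32-bit signed integer to binary32 conversion done purely with integer operations. It is an illustration written for this post, not the code of any particular soft-float library; the names int32_to_binary32_bits and reference_bits are made up, and round-to-nearest, ties-to-even is assumed as the rounding mode.

```python
import struct


def int32_to_binary32_bits(n):
    """Return the binary32 bit pattern for a 32-bit signed integer, using integer ops only."""
    if n == 0:
        return 0
    sign = 1 if n < 0 else 0
    magnitude = -n if n < 0 else n
    exponent = magnitude.bit_length() - 1  # position of the leading 1 bit

    if exponent <= 23:
        # Fits exactly: shift the leading 1 up to bit 23, then drop it (it is implicit).
        fraction = (magnitude << (23 - exponent)) & 0x7FFFFF
    else:
        # Too many bits: keep the top 24 and round to nearest, ties to even.
        shift = exponent - 23
        kept = magnitude >> shift
        remainder = magnitude & ((1 << shift) - 1)
        half = 1 << (shift - 1)
        if remainder > half or (remainder == half and kept & 1):
            kept += 1
            if kept == 1 << 24:  # rounding carried out of the significand
                kept >>= 1
                exponent += 1
        fraction = kept & 0x7FFFFF

    return (sign << 31) | ((exponent + 127) << 23) | fraction


def reference_bits(n):
    """Bit pattern produced by the platform's own conversion, for comparison."""
    return struct.unpack(">I", struct.pack(">f", float(n)))[0]
```

Comparing int32_to_binary32_bits(n) with reference_bits(n) for a few values, including ones above 2^24 where rounding is needed, is a quick sanity check on the sketch.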

C implementations that support mixed big-endian and little-endian types are rare. However, if supported, the compiler would simply swap bytes as needed. Some hardware may assist with this with instructions that either swap bytes as words are loaded and stored or that swap bytes in registers.

Answered By - Eric Postpischil

Answer Checked By - Candace Johnson (PHPFixing Volunteer)

[FIXED] How do I turn specific Delphi warnings and hints off?

July 17, 2022 compiler-construction, delphi, delphi-2007, warnings No comments

Issue

In CodeGear Delphi 2007, how can I turn specific warnings and hints off? I am attempting to turn off H2077 - Value assigned to 'varname' never used.

Solution

Hints? You can't turn off a specific one.

You'll have to disable them all:

{$HINTS OFF}

Warnings?

{$WARN _name_of_warning_ OFF|ON|ERROR}

Check here for a full list

Answered By - Lars Truijens

Answer Checked By - Pedro (PHPFixing Volunteer)

[FIXED] How to build dominance frontier for control flow graph?

June 28, 2022 compiler-construction, graph, ssa No comments

Issue

I would like to understand the common principle used for placing Ф-functions at nodes. I read that the "dominance frontier (DF)" relation in a graph allows placing Ф-functions. Here is an example control-flow graph for a simple code snippet: (image: control-flow graph)

Let's consider the definition for DF:

DF(x) is the set of nodes w such that x dominates a predecessor of w, but x does not strictly dominate w

Okay, here is my understanding of this definition. Let's consider: DF(B1) = { B3, B5, B6, B7 } because:

dom(B1, B2) & !strictly_dom(B1, B3) & is_predecessor(B2, B3);
dom(B1, B3) & !strictly_dom(B1, B5) & is_predecessor(B3, B5);
dom(B1, B3) & !strictly_dom(B1, B6) & is_predecessor(B3, B6); 
dom(B1, B6) & !strictly_dom(B1, B7) & is_predecessor(B6, B7);

Is this the right understanding of DF? Could you give me a more detailed explanation, please?

Solution

The best way to understand the role of dominance frontiers in the context of constructing SSA form is to identify join points in your graph. This is the intuition people use when constructing SSA form on paper.

If a node in your control flow graph has fewer than two predecessors, then it won't be in the dominance frontier of any node, as it cannot be a merge point for competing definitions. The "dominance frontier" of a node is precisely the set of nodes where its "dominance" ends, i.e. where its definitions may no longer be the dominant ones (the ones that definitely reach those points, without competition from alternate paths where other definitions may occur and, thus, reach the same points).

So, just by looking at your control flow graph, we can see that the only nodes with at least 2 predecessors are B2 and B7. So, these nodes will be the ones that form the frontiers of some nodes.

You must appeal to the definition of dominance to know whose frontier is composed of these nodes. By definition, every node dominates itself. So, if we look at the predecessors of the blocks B2 and B7, we can eliminate the ones that strictly dominate those nodes, because a node's frontier never contains a node it strictly dominates.

For B7, we have that B5 and B6 dominate a predecessor of B7 (namely, themselves). However, they do not strictly dominate B7 (in fact, they do not dominate B7 at all). The fact that they both compete for dominance over the definitions that reach B7 is precisely why the dominance frontiers of B5 and B6 are both {B7}. Intuitively, citing your original graph, you can see that these two predecessors both define j and k, so when control flow reaches B7, whose definitions do you use? You can't tell, since you could go through either to reach B7. So, you'd place phi nodes: j <- ϕ((j, B5), (j, B6)) and k <- ϕ((k, B5), (k, B6)). Keeping track of the source of each variable definition you're talking about is important to identify which (re)name you use when you perform renaming (you place phi nodes, then rename).

For B2, B7 is a predecessor and dominates itself by definition. However, it doesn't strictly dominate B2 as it competes with initial definitions from the entry block, B1. Therefore, you'd also place phi nodes to merge those definitions: j <- ϕ((j, B1), (j, B7)), k <- ϕ((k, B1), (k, B7)). Notice how there's no competing definition for i because i is never redefined. As far as I can tell, you could constant propagate i's value, 1, to its usage site and remove the definition.

I recommend reading "A Simple, Fast Dominance Algorithm" by Cooper et al. (https://www.cs.rice.edu/~keith/EMBED/dom.pdf). They give a simple algorithm that computes the dominator tree and also an intuitive algorithm (fig 5) for computing the frontiers by walking up the dominator tree.

You're best thinking about dominance frontiers intuitively; which competing definitions can reach a join point? In the suggested DF(B1) you give, you have that B3 ∈ DF(B1). This cannot be right. The reason being that the only definitions that can reach B3 are the ones that leave B2 (its single predecessor). However, the definitions that enter B2 are competitive because you could reach B2 by coming from B7 or B1 (hence, as described above, B2 has to have a leading phi node - the definition created by the phi node is the one that will end up reaching B3 without competition).
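The frontier computation by walking up the dominator tree can be sketched as follows. Two things here are assumptions: the edge list is reconstructed only from the edges mentioned in this post (the original figure is not reproduced, and B4 is left out because its edges are never described), and the dominators are computed with the plain set-based iterative method rather than the faster algorithm from the paper. All function names are made up for the example.

```python
def predecessors(succ):
    """Invert a successor map into a predecessor map."""
    preds = {node: [] for node in succ}
    for node, targets in succ.items():
        for target in targets:
            preds[target].append(node)
    return preds


def dominator_sets(succ, preds, entry):
    """Iterate dom(n) = {n} | intersection of dom(p) over predecessors p.

    Assumes every non-entry node is reachable, so it has at least one predecessor.
    """
    nodes = list(succ)
    dom = {node: set(nodes) for node in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for node in nodes:
            if node == entry:
                continue
            new = set.intersection(*(dom[p] for p in preds[node])) | {node}
            if new != dom[node]:
                dom[node] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """The immediate dominator is the strict dominator with the largest dominator set."""
    idom = {entry: None}
    for node, doms in dom.items():
        if node != entry:
            idom[node] = max(doms - {node}, key=lambda d: len(dom[d]))
    return idom


def dominance_frontiers(succ, entry):
    """For each join point, walk up from each predecessor until reaching the join's idom."""
    preds = predecessors(succ)
    idom = immediate_dominators(dominator_sets(succ, preds, entry), entry)
    frontier = {node: set() for node in succ}
    for node in succ:
        if len(preds[node]) >= 2:  # only join points appear in any frontier
            for pred in preds[node]:
                runner = pred
                while runner != idom[node]:
                    frontier[runner].add(node)
                    runner = idom[runner]
    return frontier


# Assumed CFG, reconstructed from the edges named in the text.
cfg = {
    "B1": ["B2"],
    "B2": ["B3"],
    "B3": ["B5", "B6"],
    "B5": ["B7"],
    "B6": ["B7"],
    "B7": ["B2"],
}
df = dominance_frontiers(cfg, "B1")
```

On this assumed graph, df["B5"] and df["B6"] are both {"B7"}, df["B7"] contains "B2", and df["B1"] is empty, which agrees with the reasoning above that B3 cannot be in DF(B1).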

Answered By - contificate

Answer Checked By - Willingham (PHPFixing Volunteer)

[FIXED] How can I ignore GCC compiler 'pedantic' errors in external library headers?

June 26, 2022 compiler-construction, compiler-errors, gcc No comments

Issue

I recently added -pedantic and -pedantic-errors to my make GCC compile options to help clean up my cross-platform code. All was fine until it found errors in external-included header files. Is there a way to turn off this error checking in external header files, i.e.:

Keep checking for files included like this:

#include "myheader.h"

Stop checking for include files like this:

#include <externalheader.h>

Here are the errors I am getting:

g++ -Wall -Wextra -Wno-long-long -Wno-unused-parameter -pedantic --pedantic-errors
-O3 -D_FILE_OFFSET_BITS=64 -DMINGW -I"freetype/include" -I"jpeg" -I"lpng128" -I"zlib"
-I"mysql/include" -I"ffmpeg/libswscale" -I"ffmpeg/libavformat" -I"ffmpeg/libavcodec"
-I"ffmpeg/libavutil" -o omingwd/kguimovie.o -c kguimovie.cpp

In file included from ffmpeg/libavutil/avutil.h:41,
             from ffmpeg/libavcodec/avcodec.h:30,
             from kguimovie.cpp:44:
ffmpeg/libavutil/mathematics.h:32: error: comma at end of enumerator list
In file included from ffmpeg/libavcodec/avcodec.h:30,
             from kguimovie.cpp:44:
ffmpeg/libavutil/avutil.h:110: error: comma at end of enumerator list
In file included from kguimovie.cpp:44:
ffmpeg/libavcodec/avcodec.h:277: error: comma at end of enumerator list
ffmpeg/libavcodec/avcodec.h:303: error: comma at end of enumerator list
ffmpeg/libavcodec/avcodec.h:334: error: comma at end of enumerator list
ffmpeg/libavcodec/avcodec.h:345: error: comma at end of enumerator list
ffmpeg/libavcodec/avcodec.h:2249: warning: `ImgReSampleContext' is deprecated
(declared at ffmpeg/libavcodec/avcodec.h:2243)
ffmpeg/libavcodec/avcodec.h:2259: warning: `ImgReSampleContext' is deprecated
(declared at ffmpeg/libavcodec/avcodec.h:2243)
In file included from kguimovie.cpp:45:
ffmpeg/libavformat/avformat.h:262: error: comma at end of enumerator list
In file included from ffmpeg/libavformat/rtsp.h:26,
             from ffmpeg/libavformat/avformat.h:465,
             from kguimovie.cpp:45:
ffmpeg/libavformat/rtspcodes.h:38: error: comma at end of enumerator list
In file included from ffmpeg/libavformat/avformat.h:465,
             from kguimovie.cpp:45:
ffmpeg/libavformat/rtsp.h:32: error: comma at end of enumerator list
ffmpeg/libavformat/rtsp.h:69: error: comma at end of enumerator list

Solution

You could fix the headers and submit a patch to FFmpeg; compatibility with -pedantic is a worthy goal, so I'm sure they'd consider it, especially if it just involved removing trailing commas and suchlike.

Answered By - Mike F

Answer Checked By - David Marino (PHPFixing Volunteer)

[FIXED] Why are unused variables bad?

April 27, 2022 c, compiler-construction, gcc, llvm, warnings No comments

Issue

I would like to know why an unused variable is bad.

Is it because the compiler would create a bigger binary? If yes, is there a tool/script which can add an unused keyword or something like that?

Solution

The compiler gives you warnings to point out things that could potentially be a problem or unintentional.

Unused variables will most likely be optimized away. But maybe you intended to do something with them, and in that case the compiler helpfully notes that you may have done something you didn't want.

What's the use in a variable you declare but neither read from nor write to?

Answered By - Joey

Answer Checked By - Candace Johnson (PHPFixing Volunteer)
